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1.0  Introduction 


The  Multivariate  Optimum  Interpolation 
A  ilysis  of  Meteorological  Data  at  the 
Fleet  Numerical  Oceanography  Center 


In  simplest  terms,  the  forecast  skill  of  an  automated  atmospheric 
prediction  system  is  dependent  upon  two  factors:  (1)  the  ability  of  the 
forecast  model  to  realistically  depict  actual  atmospheric  processes;  and 
(2)  the  ability  to  provide  the  model  with  initial  conditions  that  reflect  the 
true  state  of  the  atmosphere  at  scales  that  are  resolvable  by  the  model. 
The  first  factor  is  strictly  dependent  upon  the  formulation  of  the  forecast 
model  itself,  while  the  second  factor  is  dependent  upon  all  components 
of  the  data  assimilation  system.  To  produce  the  best  initial  conditions 
for  their  forecast  models,  all  operational  forecast  centers  continually 
run  a  four-dimensional  data  assimilation  cycle  that  generally  consists 
of  four  components:  data  quality  control,  data  analysis,  model  initialization, 
and  the  forecast  model  itself. 

While  the  components  of  the  data  assimilation  system  each  perform 
a  separate  function,  they  are  inextricably  connected.  Given  an  estimate 
of  the  current  state  of  the  atmosphere,  the  analysis  must  somehow  update 
this  information  using  current  observations.  A  short-range  forecast, 
initialized  from  the  previous  analysis  and  valid  at  the  current  analysis 
time,  provides  first-guess,  or  background,  fields  for  the  current 
analysis.  This  analysis,  in  turn,  provides  the  initial  fields  for  the  next 
forecast;  hence,  the  nomenclature  data  assimilation  cycle.  The  forecast 
model  serves  as  the  integrator  of  past  observations,  and  the  analysis 
serves  as  the  assimilator  of  current  observations.  Effective  quality  control 
of  the  observational  data  is  vital  prior  to  the  analysis,  and  even  during  the 
analysis  itself,  to  prevent  erroneous  observations  from  contaminating 
the  results.  Following  the  analysis,  the  initialization  step  adjusts  the 
analyzed  fields  prior  to  their  use  as  initial  conditions  by  the  forecast 
model,  thereby  removing  gravity-wave  noise  that  can  contaminate  the 
first  few  forecast  hours.  If  the  analysis  is  performing  as  an  effective 
component  of  the  data  assimilation  system,  the  analyzed  fields  will 
closely  reflect  the  information  contained  in  the  good  observations  and 
the  initialization  will  make  only  small  adjustments  to  these  fields. 

Over  the  past  35  years,  the  automated  analysis  of  meteorological 
data  has  evolved  from  the  relatively  simple  objective  analysis  techniques 
introduced  by  Bergthdrssen  and  Dbbs  (1955)  and  elaborated  upon  by 
Cressman  (1959)  to  the  more  sophisticated  statistical  techniques  of  the 
optimum  interpolation  (OI)  methodology  first  described  by  Gandin  (1963). 
Although  the  operational  implementation  of  OI  techniques  lagged  this 
early  theoretical  development  by  over  a  decade,  OI  is  now  used  for 
meteorological  data  analysis  by  many  of  the  world’s  operational 
forecasting  centers  (Rutherford,  1976;  McPherson  et  al.,  1979;  Lorenc, 
1981;  Mills  and  Seaman,  1990).  Comparisons  of  some  of  these  analysis 
systems  are  given  by  Daley  et  al.  (1 985)  and  Carr  ( 1 987).  The  evolution 
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of  the  U.S.  Navy’s  operational  analysis  system  has  followed  a  similar 
course.  The  original  successive  corrections  analysis  (Barker,  1982)  was 
replaced  by  a  global  multivariate  optimum  interpolation  (MVOI)  analysis 
in  1988  (Barker,  1992;  Goerss  and  Phoebus,  1992).  The  MVOI  was 
developed  by  the  Marine  Meteorology  Division  of  the  Naval  Research 
Laboratory  (NRL),  and  is  run  operationally  by  the  Navy’s  Fleet  Numerical 
Oceanography  Center  (FNOC)  in  Monterey,  California. 

The  MVOI  is  an  integral  part  of  the  Navy  Operational  Global 
Atmospheric  Prediction  System  (NOGAPS),  which  basically  consists  of 
the  four  components  described  above.  Along  with  the  changes  to  the 
analysis,  major  upgrades  were  made  to  the  other  system  components  in 
1988,  as  well.  The  global  grid  point  model  (Rosmond,  1981)  was  replaced 
with  a  spectral  forecast  model,  and  the  variational  initialization  was 
replaced  by  a  normal  mode  initialization.  These  components  are  described 
in  detail  by  Hogan  et  al.  (1991).  Continuous  research  and  experimentation 
since  that  time  have  resulted  in  further  improvements  to  the  various 
NOGAPS  components.  The  preliminary  quality  control  of  the  observations 
is  described  by  Baker  (1991),  while  advances  in  the  initialization  and 
the  forecast  model  are  discussed  by  Rosmond  (1992). 

This  report  describes  the  Navy’s  operational  atmospheric  analysis 
system  and  documents  the  status  of  that  system  as  it  existed  in  April  1991. 
The  global  analysis  is  produced  for  16  pressure  levels  (the  15  standard 
levels  from  1000  mb  to  10  mb  plus  925  mb)  on  the  Gaussian  grid  of 
the  NOGAPS  spectral  forecast  model.  The  Gaussian  grid  points  are 
separated  by  1.5°  in  the  east-west  direction  and  by  approximately  1.5° 
in  the  north-south  direction.  The  analyzed  meteorological  variables  are 
geopotential  height  and  the  u-  and  rewind  components.  A  complete 
description  of  the  techniques  and  algorithms  used  in  the  MVOI  analysis 
is  given  in  Section  2.0,  with  an  emphasis  on  the  global  application  of 
that  analysis.  Section  3.0  provides  a  summary  of  the  global  meteorological 
database  and  discusses  how  the  data  are  prepared  for  the  NOGAPS 
analysis.  The  analysis  process  itself  is  described  in  detail  in  Section  4.0. 
Since  the  MVOI  was  developed  so  that  with  some  modification  it  can 
be  used  to  satisfy  other  Navy  requirements,  we  describe  these  other 
analysis  applications  and  their  associated  modifications  to  the  MVOI 
in  Section  5.0.  Finally,  we  will  summarize  improvements  made,  to  date, 
and  discuss  our  plans  for  future  upgrades  to  the  data  assimilation 
system. 


2.0  Multivariate 
Optimum  Interpolation 
Analysis 
2.1  Theory 


In  a  data  assimilation  cycle,  the  data  analysis  step  provides  a  way  of 
updating  a  first-guess,  or  background,  field  using  data  that  has  become 
available  since  the  last  analysis  was  made.  The  background  field  is 
usually  a  model  forecast,  initialized  from  the  previous  analysis,  that 
is  valid  at  the  synoptic  time  nearest  the  new  observations.  Since  the 
data  analysis  is  actually  an  estimate  of  the  change  in  the  background 
field,  it  typically  involves  the  computation  of  weighted  sums  of  data 
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increments,  valid  at  fixed  grid  points  over  the  area  of  interest.  The 
data  increments  are  formed  by  interpolating  the  background  field  to 
the  locations  of  observations  and  taking  the  differences  between  the 
observed  values  and  the  interpolated  background  values.  The  analyzed 
increments  are  then  added  to  the  background  values  at  each  analysis 
point  to  produce  the  new  estimate  of  the  field.  In  mathematical 
notation, 


/£»#+£  "k/f.  (i) 

i  =  1 

where  F  represents  a  full-field  variable  and  /  represents  an  incremental 
variable.  The  superscripts  a ,  p,  and  o  distinguish  analyzed,  predicted 
(background),  and  observed  values,  respectively.  The  subscript 
k  indicates  an  analysis  grid  point  location,  while  the  subscript  i  represents 
an  observation  location.  Finally  w,*  is  the  weight  associated  with  the 
observed  increment  f°  for  the  analysis  at  grid  point  k.  What  is  unique 
to  the  01  methodology  is  the  way  the  weights  w ik  are  computed. 

The  01  technique  utilizes  observed  data  to  compute  increments  to  the 
background  field  in  such  a  way  that  the  mean  squared  error  of  the 
analysis  will  be  minimized  if  computed  over  a  statistically  significant 
number  of  cases.  There  are  some  limitations  to  this  optimization,  however, 
due  to  the  various  assumptions  that  must  be  made,  and  due  to  the 
inexact  determination  of  the  statistical  parameters  that  must  be  specified. 
These  limitations  will  become  more  apparent  as  the  discussion  continues. 
The  theoretical  development  of  OI  is  well  documented  elsewhere 
(Bergman,  1979;  Lorenc,  1981),  but  for  completeness,  descriptions  of 
the  basic  equations  will  be  presented  in  this  section. 

Let  Fi  represent  the  true  value  of  some  variable  at  a  given  location. 
While  we  would  like  to  know  Fk  exactly,  at  best  we  can  only  estimate 
it  from  the  available  sources  of  information.  The  predicted  value  Fj[ 
provides  one  such  estimate.  Through  the  model  forecast,  we  have 
information  carried  along  from  past  observations,  modified  and  controlled 
by  the  physical  and  dynamical  constraints  of  the  model  itself.  From  the 
nearby  observations,  the  Fj\  we  have  estimates  of  the  current  state  of 
the  atmosphere.  The  objective  analysis  scheme  combines  these  various 
sources  of  information  to  produce  another  estimate  of  Flk,  which  we  have 
designated  Fk.  None  of  these  estimates  is  likely  to  exactly  represent 
Fk,  since  each  estimate  has  its  own  inherent  sources  of  error. 

Thus,  we  can  denote  the  actual  error  associated  with  each  individual 
value  as 


Analysis  Error:  Ea  =  F*  -  F[ 

(2) 

Observation  Error:  E°  =  Ff  -  F\ 

(3) 

Prediction  Error:  Ep  =  F£  -  F[  . 

(4) 
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In  this  discussion,  we  assume  that  the  expected  value  (or  mean)  of 
these  errors  is  zero.  Given  a  large  ensemble  of  similar  realizations,  we 
can  determine  the  following  statistical  error  estimates 


Analysis  Error  Estimate  £a=((£'a)2) 

(5) 

Observation  Error  Estimate  E°=-([e°Y ) 

(6) 

Prediction  Error  Estimate  Ep=({ep)2)  2  , 

(7) 

where  the  angle  brackets  indicate  the  ensemble  average.  We  can  then 
normalize  the  various  errors  by  these  error  estimates.  Thus,  in 
dimensionless  form  we  have 


ep  =  Ep/Ep 

(8) 

e°  =  E°IE° 

(9) 

ea  =  Ea/Ea. 

(10) 

Similarly,  the  observation  error  estimate  and  analysis  error  estimate  are 
normalized  by  the  prediction  error  estimate  giving 


=  E°/Ep 

(ID 

=  Ea/Ep . 

(12) 

Since  we  work  only  with  increments  from  the  background  field,  the 
observed  increment  and  the  analyzed  increment  will  be  represented  in 
dimensionless  form  by 


(13) 


(14) 


Then,  the  analysis  at  a  specific  point  k,  for  a  set  of  N  observations,  may 
be  written  in  the  general  form  shown  in  equation  (1).  The  analyzed 
increment  /£  at  point  k  is  given  by  the  summation 


N 


/  =1 


(15) 
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By  substituting  equations  (13)  and  (14)  into  equation  (1),  squaring, 
and  taking  the  ensemble  average,  we  can  derive  the  expression  for  the 
error  resulting  from  the  analysis  at  location  k  in  terms  of  the  relationships 
between  the  normalized  prediction  and  observation  errors.  After  some 
manipulation,  we  can  express  the  normalized  analysis  error  variance  at 
location  k  as 


(4)2  =  1-2  1 

i~  1 


N  N 

*>ikh>k+  X  X  wik*jkMij 

'  =  i  y = i 


(16) 


where 


(17) 

■<i = («f?) + -  '?(*?*/}  -  <«p ?)«;  ■ 

(18) 

The  terms  inside  the  angled  brackets  are  error  correlations.  For  example, 
(zfzf)  and  represent  the  prediction  error  correlation  and 

observation  error  correlation,  respectively,  between  the  observation 
locations  i  and  j.  In  practice,  the  observation  error  and  prediction  error 
are  assumed  to  be  uncorrelated,  so  terms  that  contain  cross-correlations 
such  as  (ePt in  equations  (17)  and  (18)  are  equal  to  zero.  We  also 
assume  that  the  errors  associated  with  observations  made  at  different 
locations  are  uncorrelated.  This  assumption  allows  us  to  set  observation 
error  correlations  such  as  (e^e?)  equal  to  one  for  i=j,  and  equal  to 
zero,  otherwise.  These  assumptions  simplify  the  expressions  for  and 


,J  10 

vk6/’) 

(19) 

(20) 

The  computation  of  My  for  all  j  pairs  results  in  an  NxN  matrix  M, 
where  the  diagonal  term  on  row  i  is  equal  to  1  +  (ef )2  and  the  off- 
diagonal  terms  on  row  i  are  equal  to  (tfzf),  that  is,  the  prediction  error 
correlations  between  location  i  and  the  other  N  -  1  observation  locations. 
Similarly,  the  computation  of  hik  for  all  observations  results  in  an 
N-dimensional  row  vector  hk  denoting  the  prediction  error  correlation 
between  the  N  observation  locations  i  and  the  analysis  grid  point 
location  k. 

The  key  to  the  OI  methodology  is  to  determine  the  weights  Wjk  that 
minimize  the  analysis  error  variance  in  equation  (16).  As  with  any 
minimization  problem,  we  take  the  partial  derivatives  of  equation  (16) 
with  respect  to  wik  for  i  =  1, .  .  .,  equate  the  derivatives  to  zero;  and 
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solve  the  resulting  set  of  N  linear  equations  for  the  weights.  The  solution 
to  this  problem  is 


2.2  Specification  of 
Statistical  Parameters 


w*  =  M_1h*  (21) 

where  w*  is  the  ^-dimensional  row  vector  of  weights  w,*  and  M-1  is 
the  inverse  of  matrix  M.  In  other  words,  the  weight  received  by  observation 
i  for  the  analysis  at  grid  point  it  is  a  function  of  the  normalized 
observation  error,  the  normalized  prediction  error  correlations  between 
the  observation  location  i  and  the  other  N -l  observation  locations, 
and  the  normalized  prediction  error  correlations  between  the  N  observation 
locations  and  the  grid  location  k.  From  this  development  it  can  be  seen 
that  the  weights  computed  for  the  observations  are  dependent  upon  the 
specification  of  the  observation  error  estimate  E°,  the  prediction  error 
estimate  Ep,  and  the  prediction  error  correlations  (epep). 

Before  we  describe  these  quantities  in  more  detail,  it  would  be  useful 
to  elaborate  on  the  multivariate  and  three-dimensional  nature  of  the 
analysis.  Essentially,  multivariate  means  that  all  variable  types  influence 
the  analysis  of  any  one  of  the  variables.  In  the  Navy’s  MVOI,  the  analysis 
variables  are  geopotential  height  and  the  u-  and  rewind  components 
(<}>,  u,  and  v).  We  use  observations  of  height,  wind,  and  thickness 
(A<j>  -  height  differences  between  adjacent  analysis  pressure  levels)  to 
analyze  these  three  variables.  The  analyzed  geopotential  height  increment 
at  a  given  location,  then,  is  affected  not  only  by  nearby  observed  height 
increments,  but  also  by  the  observed  wind  and  thickness  increments. 
Thus,  in  equation  (1),  the  lower  case  f°'s  do  not  necessarily  represent  the 
same  variable  type  as  the  upper  case  F%  and  Fpk  and,  in  fact, 
are  usually  a  mixture  of  variable  types.  The  analysis  is  also 
three-dimensional,  which  means  that  the  observed  increments  do  not 
have  to  be  located  at  the  same  level  as  the  computed  analysis  increment. 
For  example,  wind  increments  at  250  mb  may  influence  the  height 
analysis  at  200  mb.  The  coupling  of  the  variables  in  three-dimensional 
space  is  achieved  through  the  specification  of  the  appropriate  correlation 
functions. 


In  this  section  we  describe  in  detail  how  the  statistical  parameters 
required  by  the  NOGAPS  MVOI  analysis  are  specified.  Because  of  the 
multivariate  three-dimensional  nature  of  the  analysis,  we  must  specify 
the  prediction  error  correlations  relating  different  variable  types  at  different 
horizontal  and  vertical  locations.  Thus,  we  model  the  prediction  error 
correlation,  assuming  that  it  can  be  expressed  as  the  product  of  vertical 
and  horizontal  components.  For  the  horizontal  component,  the  multivariate 
specification  of  the  prediction  error  correlations  in  the  analysis  is  achieved 
using  the  relationship  described  by  Buell  (1972).  The  horizontal  prediction 
error  correlation  model  that  is  specified  correlates  geopotential  height 
or  thickness  data  at  one  location  with  geopotentiai  height  or  thickness 
data  at  another  location,  and  is  primarily  a  function  of  the  horizontal 
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separation  between  the  two  locations.  The  multivariate  coupling  is  achieved 
by  differentiating  the  height-height  correlation  function  to  derive  the 
appropriate  prediction  error  correlation  functions  for  various  combinations 
of  <(>  or  A<|>,  u,  and  v  observations.  This  ensures  that  the  geostrophic 
relationship  between  the  analysis  increments  of  geopotential  height  and 
v/ind  is  preserved  in  the  extratropics.  It  does  not,  however,  ensure  that 
the  full  analyzed  fields  of  geopotential  height  and  wind  are  in  geostrophic 
balance.  The  geostrophic  and  nondivergent  constraints  are  controlled 
using  the  parameter  method  (Daley,  1985)  discussed  below. 

As  we  have  formulated  the  analysis,  the  parameters  that  must  be 
determined  are  the  observation  error  estimates  ( E° ),  the  prediction  error 
estimates  (Ep),  and  the  prediction  error  correlations  (e^e p).  In  general, 
prediction  error  represents  the  amount  of  error  introduced  at  a  particular 
location  due  to  the  inexactness  of  the  forecast  model,  while  observation 
error  includes  both  instrument  error  and  subgrid-scale  sampling  errors, 
where  the  measurement  is  simply  not  representative  of  features  that  can 
be  resolved  by  the  model.  Given  an  incremental  analysis  scheme  that  is 
both  multivariate  and  three-dimensional,  the  prediction  error  correlations 
must  likewise  be  defined  three-dimensionally  for  data  increments  of  all 
different  variable  types. 

Toward  this  goal,  NRL  has  developed  a  database  that  consists  of  the 
data  increments  of  height  and  wind  at  all  levels  for  all  available  radiosonde 
observations.  At  any  given  time,  30-day  histories  of  these  increments 
for  both  the  0000  UTC  and  1200  UTC  analysis  times  reside  in  on-line 
disk  storage.  As  an  example  of  the  information  contained  in  this  database, 
the  standard  deviations  of  the  500-mb  height  increments  computed  from 
a  30-day  history  of  1200  UTC  radiosonde  observations  are  plotted  in 
Figure  1  for  a  selected  area.  The  number  of  observations  available  at 
each  radiosonde  station  is  plotted  to  the  right  of  the  standard  deviation 
with  the  station  identifier  plotted  below.  If  we  overlook  small-scale 
local  variability,  we  see  that,  in  general,  the  standard  deviations  increase 
with  increasing  latitude,  while  at  the  same  latitude,  the  values  on  the 
west  coast  tend  to  be  larger  than  those  on  the  east  coast.  However, 
since  the  standard  deviations  are  computed  from  the  data  increments, 
they  reflect  both  the  observation  errors  and  the  prediction  errors  at  each 
observation  location.  A  discussion  of  how  to  partition  this  information 
into  the  separate  error  components  is  given  in  two  papers  (Hollingsworth 
and  Lttnnberg,  1986;  LOnnberg  and  Hollingsworth,  1986),  which  we 
will  henceforth  simply  refer  to  as  HLH. 

Before  we  can  describe  how  the  increments  contained  in  this  database 
are  used  to  estimate  the  statistical  parameters  required  by  the  analysis, 
some  further  theoretical  development  is  required.  From  equations  (3) 
and  (4)  it  can  be  easily  seen  that  the  increment  for  observation  i  is  also 
the  difference  between  the  observation  error  and  the  prediction  error 
at  the  observation  location  (i.e.,  F°-Fp =E°-EP).  Thus,  taking  the 
product  of  the  increments  for  observations  i  and  j  and  averaging  over 
a  large  ensemble  of  similar  realizations,  we  have  the  covariance 


cr((E°-E?)(ErEi))*(E°ErE‘Ei-E°Ei+E?Ei)  ■  (22> 
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Figure  1.  Statistics  computed  from  a  30-day  history  of  1200  UTC  radiosonde  observation  increments.  The 
50iJ-mb  height  increment  standard  deviation  is  plotted  to  the  upper  left  of  each  station,  with  the  number  of 
available  observations  plotted  to  its  right.  The  station  identifier  is  plotted  below. 


Since  we  assume  that  the  observation  error  and  prediction  error  are 
uncorrelated,  the  cross-covariance  terms  vanish.  Furthermore,  if  we  limit 
this  discussion  to  the  horizontal  component  of  the  correlation 
(i.e.,  observations  /  and  j  are  at  the  same  vertical  level),  we  can  also 
assume  that  the  observation  errors  are  uncorrelated.  Thus,  for  i  not 
equal  to  j,  we  have 

Cij  =  (E?E j)-  (23) 
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However,  observation  error  correlation  cannot  be  neglected  when  i  =  j. 
Thus,  the  variance  of  the  increments  at  location  i  is  given  by 


cu  -  ((*?f)  +  (K)2)  -  (£7)2  +  (*?)*.  (24) 

Since  we  assume  that  both  the  observation  error  and  the  prediction 
error  have  zero  mean,  we  see  from  equation  (24)  that  the  variance  of 
the  increments  is  equal  to  the  sum  of  the  squares  of  the  observation  and 
prediction  error  estimates.  If  we  further  assume  that  these  error  estimates 
do  not  vary  with  observation  location,  we  can  simply  denote  the  variance 
of  the  increments  for  all  observations  by 

C  =  {e°)2  +  {eP)2.  (25) 


From  our  database,  we  can  readily  estimate  the  correlation  CJC. 
However,  this  estimate  is  biased  because  C  contains  contributions  from 
both  prediction  error  and  observation  error.  What  we  want  to  estimate 
is  the  normalized  prediction  error  correlation  (fftf)-  But  we  know  that 


and 


S 


(iM) 

(ef-rf 


(26) 


(27) 


From  equations  (26)  and  (27)  we  see  that 


Sl 

c 


(28) 


The  presence  of  the  square  of  the  observation  error  estimate  in 
equation  (24)  biases  the  correlation  estimates  obtained  from  the  data 
by  the  factor  of  R,  where 


R  = 


(29) 


As  observation  separation  approaches  zero,  then,  the  correlation  estimate 
Cfj  / C  will  approach  R,  rather  than  1.0  as  would  be  expected.  The 
denominator  of  equation  (29)  is  simply  the  variance  of  the  increments. 
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Thus,  if  we  can  determine  the  value  of  R,  we  can  compute  [eP)  and 
subsequently  (tf°)  ,  thereby  partitioning  the  variance  of  the  increments 


into  the  contributions  of  prediction  error  and  observation  error.  From 
the  database,  we  can  estimate  the  horizontal  correlations  Cy/C 
for  observation  increments  at  various  separations.  By  fitting  curves  to 
these  results,  we  can  estimate  the  value  of  R  at  each  analysis  level.  The 
function  that  best  fits  the  scatter  plot  of  Cy/C  versus  distance  becomes 
our  horizontal  height-height  correlation  function. 

The  North  American  continental  radiosonde  stations  are  frequently 
used  for  these  computations  because  the  stations  are  dispersed  over  a 
range  of  latitudes,  the  quality  of  the  observations  is  good,  and  the 
instruments  used  in  the  U.S.  and  Canada  have  similar  error  characteristics. 
Thus,  from  our  database,  we  used  the  1200  UTC  height  increments 
from  March  1990  for  an  area  extending  from  25°N  to  65°N  and  from 
60°W  to  130°W  to  compute  sample  estimates  of  Cy  /C  at  several  different 
analysis  levels.  Increments  for  a  particular  radiosonde  station  were  used 
in  the  calculations  only  if  the  station  reported  at  least  18  times  during 
the  month.  Furthermore,  only  increments  that  were  not  rejected  by  the 
global  MVOI  analysis  were  included.  For  each  station,  the  mean  value 
of  its  increments  was  determined  and  subtracted  from  each  increment 
to  remove  the  effects  of  both  observational  bias  and  mean  background 
errors.  For  each  analysis  level,  horizontal  correlation  estimates  were 
computed  and  placed  in  bins  based  on  the  distance  between  the  station 
pairs.  The  points  plotted  in  Figures  2  and  3  represent  the  average 


DISTANCE  (km) 


Figure  2.  Estimates  of  the  horizontal  correlation  as  a 
function  of  distance  for  radiosonde  height  increments  at 
250  mb,  500  mb,  and  850  mb. 


DISTANCE  (km) 


Figure  3.  Estimates  of  the  horizontal  correlation  as  a 
function  of  distance  for  radiosonde  height  increments  at 
20  mb,  50  mb,  and  100  mb. 
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correlations  for  100-km  centered  bins.  Only  points  where  a  sufficient 
number  of  station  pairs  were  available  are  shown. 

In  computing  the  correlation  estimates,  we  also  determined  sample 
estimates  of  the  variance  of  the  increments,  C,  at  each  level.  To  determine 
the  value  of  R  at  each  level,  we  must  select  a  correlation  function  to 
fit  the  computed  correlations.  There  are  several  suitable  functions  that 
can  be  considered.  For  example,  early  applications  of  MVOI  in  global 
data  assimilation  used  the  negative  squared  exponential  (NSE)  function 
(Bergman,  1979;  Lorenc,  1981).  As  described  in  HLH,  the  horizontal 
correlation  function  used  by  the  ECMWF  is  the  summation  of  Bessel 
functions.  More  recently,  Thi^baux  et  al.  (1990)  determined  that  the 
optimal  correlation  function  for  the  NMC  global  data  assimilation  system 
was  a  third-order  autoregressive  (TOAR)  function.  Franke  et  al.  (1988) 
examined  the  statistical  properties  of  various  functions,  including  the 
NSE,  second-order  autoregressive  (SOAR),  and  TOAR,  along  with  their 
performance  when  they  were  used  as  th*  h^rizontal  correlation  function 
for  an  01  analysis.  It  was  found  that  ;!,d  SOAR  possessed  all  of  the 
statistical  properties  required  for  an  MVOI  analysis  correlation  function 
and  that  it  could  be  adequately  fitted  to  correlations  estimated  from 
observations.  Furthermore,  it  was  found  that  the  quality  of  the  analyses 
produced  using  the  SOAR  were  comparable  to  those  produced  using  the 
more  complicated  TOAR  and  superior  to  those  produced  using  the  NSE. 
Based  on  the  results  of  this  study  and  the  fact  that  the  SOAR  function 
can  be  determined  with  great  computational  efficiency,  we  selected  the 
SOAR  as  the  correlation  function  for  the  Navy’s  MVOI  analysis. 

The  suitability  of  the  SOAR  function  for  our  system  is  substantiated 
by  the  fit  to  the  radiosonde  height  data  shown  by  the  curves  plotted  in 
Figures  2  and  3.  The  zero-intercept  of  each  curve  gives  us  the  value  of 
R  at  that  level.  From  this  value,  we  can  determine  the  appropriate  partition 
of  prediction  error  and  observation  error.  For  example,  at  500  mb  the 
sample  variance  C  was  206  m2.  From  Figure  2  we  can  see  that 

«  p 

/?500  =  0.6.  Thus,  using  equation  (29)  we  can  determine  that  E 50o  is 

-  O 

approximately  1 !  m  and  from  equation  (28)  E  50o  is  approximately  9  m. 
In  titis  way,  we  obtained  the  partition  of  prediction  and  observation 
errors  for  height  that  are  used  in  the  analysis  (Fig.  4).  We  assume  that 
these  height  prediction  errors  are  applicable  at  45°N,  the  center  of  the 
area  for  which  the  increment  correlations  were  computed.  We  also  assume 
that  these  observation  errors  are  valid  only  for  the  radiosonde  height 
data.  Observation  errors  for  other  data  types  must  be  addressed  separately, 
as  must  the  prediction  errors  for  wind  data. 

While  there  is  some  variation  in  the  functions  fitted  from  level  to 
level,  we  have  formulated  the  analysis  to  use  the  same  horizontal  height 
correlation  function  at  all  analysis  levels.  In  particular,  we  selected  the 
function  fitted  to  the  500-mb  level  data  shown  in  Figure  2.  To  justify 
the  choice  of  the  500-mb  function  for  all  levels,  we  convert  all  of  the 
functions  shown  in  Figure  2  and  3  to  true  correlation  functions  with  a 
zero-intercept  of  one  (Fig.  5).  The  fitted  250-mb  correlation  function  is 
identical  to  the  one  fitted  at  500  mb.  The  850-mb  function  is  slightly 
broader  and  is  identical  to  the  function  fitted  at  20  mb.  The  functions 
for  the  lower  stratospheric  levels  (100  mb  and  50  mb)  are  broader  still. 
This  broadening  is  consistent  with  the  results  shown  in  HLH  for  the 
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Figure  4.  Prediction  error  and  observation  error  for 
radiosonde  height  data  at  each  analysis  level. 


Figure  5.  Horizontal  correlation  functions  shown  in 
Figures  2  and  3,  after  correcting  for  biases  introduced 

by  the  observation  error.  { 


ECMWF  data  assimilation  system,  although  they  showed  a  greater  contrast 
between  the  tropospheric  and  stratospheric  functions. 

The  exact  form  of  the  horizontal  correlation  function  used  for  height 
and  thickness  data  in  NOGAPS  is  actually  a  modified  SOAR  function  i 

given  by 

(♦/♦/)  =  1  -  c2  +  c2  ( 1  +  cjr#)  exp  [  -  c  ,r(y] ,  (30) 

where  C]  and  C2  are  constants,  ry  is  the  horizontal  (great-circle)  distance  4 

separating  observations  i  and  j,  <J>  is  the  normalized  geopotential  height, 
and  exp  is  the  exponential  function.  This  normalized  correlation  function 
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has  a  value  of  1.0  for  co-located  observations  (r,y  =  0),  and  decreases 
with  increasing  separation  r,y.  How  rapidly  the  function  decreases  is 
controlled  by  the  values  assigned  to  the  constants  Cj  and  c 2.  For  the 
SC  R  function  fitted  to  the  500-mb  level  data  in  Figure  2,  the  values 
for  C|  and  c 2  are  2.6  and  0.9,  respectively,  with  the  observation  separation, 
r,  expressed  in  units  of  1000  km. 

The  normalized  horizontal  correlation  functions  for  other  data-type 
pairs  are  derived  from  the  first  and  second  derivatives  of  equation  (30). 
Differentiation  with  respect  to  r  yields 

=  ~cv  {rij)  exP  [  “  ciru] •  (31) 


.  2 

dr2  * 


(1  -  C|f}y)  exp[-Cj^y] . 


(32) 


The  constant  cv  is  simply  the  reciprocal  of  the  component  length 
scale  Lc  discussed  in  HLH.  For  the  ECMWF  system,  they  found 
tropospheric  length  scales  on  the  order  of  300-350  km  and  stratospheric 
length  scales  on  the  order  of  400-450  km.  Thus,  the  ECMWF  analysis 
is  broken  into  tropospheric  and  stratospheric  layers  using  different 
correlation  functions  that  reflect  the  aforementioned  length  scales. 
However,  ECMWF  found  that  they  must  exercise  great  care  in  combining 
the  results  of  the  layer  analyses  to  produce  the  total  analysis.  For  NOGAPS, 
we  found  that  Lc  varied  from  approximately  400  km  at  500  mb  and  250  mb 
to  approximately  440  km  at  50  mb.  This  small  contrast  between  the 
tropospheric  and  stratospheric  length  scales  in  our  system  indicates  that 
it  is  quite  reasonable  to  perform  a  full-depih  analysis.  Using  the  current 
values  of  cj  and  cj,  we  obtain  a  value  of  c„=  2.47,  which  corresponds 
to  a  length  scale  of  400  km.  We  apply  this  value  at  all  analysis  levels. 

Using  the  derivatives  of  the  height-height  correlation  model,  and 
converting  from  natural  to  rectangular  coordinates,  the  functions 
correlating  the  other  possible  combinations  of  normalized  variables  at 
locations  1  and  j  are  given  by 


(34) 


(<My)  =  -("«*/} 

(vAj)  =  cos  0 V'ff- v)j  , 


(35) 

(36) 
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(37) 


(u(Vj)  =  -  sin  e  cos  e  — 2“ 


1  -2v) 


r  3r 


cos2  0  +  -4 


^sin2eU 

c2  r 

V 


dr 


_/vcos20  +  ijlVsin2e]?!&^ 


Drz 


M =~lLjr si"2  9  +  :5 cos2  e)  7 


(38) 

(39) 


(40) 


(41) 


where  0  is  the  angle  of  rotation  between  the  rectangular  coordinate 

system  and  the  natural  coordinate  system  whose  x-axis  lies  along  the  < 

line  connecting  observation  locations  i  and  j.  The  parameters  p  and  v 

control  the  geostrophic  coupling  and  divergence,  respectively.  Both 

pandv  range  from  0  to  1,  with  p  =  1  representing  full  geostrophic 

coupling  of  the  wind/height  correlations  and  v  =  0  representing  fully 

rotational  flow.  Outside  the  tropics,  we  currently  set  p  =  0.9  and  4 

v  =  0.  In  the  tropics,  the  geostrophic  constraint  is  relaxed  (p  =  0.5) 

and  the  divergence  is  permitted  to  be  nonzero  (v  =  0. 1 ).  Plots  of  sample 

correlation  functions  for  the  various  data-type  pairs  in  the  extratropics 

are  shown  in  Figures  6(a-f).  For  comparison,  the  same  functions  are 

shown  in  Figures  7(a-f),  but  with  the  values  of  p  and  v  modified  for  the  ^ 

tropics. 

Estimating  the  vertical  correlation  function  is  not  quite  as 
straightforward  as  determining  the  horizontal  correlation  function.  Due  to 
the  way  that  we  have  formulated  the  MVOI,  a  single  vertical  correlation 
function  is  used  for  both  height  and  wind  data.  While  wind  observations 
have  little  or  no  vertical  correlation  of  observation  error,  radiosonde  i 

height  observations  possess  strong  vertical  observation  error  correlations. 

Therefore,  the  assumptions  that  lead  to  the  definition  of  Cjj  in  the 
horizontal  case  do  not  apply  here.  For  observations  at  levels  k  and  /  at 
the  same  horizontal  location,  instead  of  equation  (23),  we  obtain  from 
equation  (22)  | 

c«=(^)  + (*?*?)■  (42) 
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Figure  6.  Horizontal  components  of  the  prediction  error  correlation  functions  used  in  the  NOGAPS 
MVOl  extratropical  analysis,  assuming  p  =  0.9  and  V  =  0.0.  (a)  (ftp),  (b)  (pu),  (c)  (<pv),  (d)  (uv), 
(e)  ( uu ),  and  (f)  (w). 
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Multiplying  each  term  in  equation  (42)  by  1, 


cki  ~ 


EpkEp,  k  i  --- 


E°k  El 


(43) 


Then, 

Cu  =  r£,  +  E°kE°,  (44) 

where  and  rkl  are  estimates  of  the  vertical  prediction  error  correlation 
and  vertical  observation  error  correlation,  respectively,  between  levels 
k  and  /.  These  two  correlation  estimates  are  the  only  quantities  in 
equation  (44)  that  we  cannot  readily  estimate  from  the  aforementioned 
database.  We  then  assume  that  the  vertical  observation  error  correlation 
can  be  represented  by 

r°u  =(0.95)",  (45) 

where  n  is  the  absolute  value  of  the  difference  between  integer  levels 
k  and  /.  Substituting  equation  (45)  into  equation  (44),  we  solve  for  the 
vertical  prediction  error  correlations  for  height.  The  radiosonde 
increment  database  is  used  to  compute  estimates  of  the  vertical 
correlation  functions  for  height  and  wind.  Figure  8  displays  the  estimated 


Figure  8.  The  estimated  height  correlation 
function  for  an  observation  located  at 
500  mb,  along  with  the  adjusted  function 
obtained  by  taking  into  account  the  vertical 
correlation  of  height  observation  error. 
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height  correlation  function  for  an  observation  located  at  500  mb,  along 
with  the  adjusted  function  obtained  by  taking  into  account  the  vertical 
correlation  of  height  observation  error.  Figure  9  shows  the  adjusted 
height  correlation  function  plotted  alongside  the  estimated  vertical  wind 
component  correlation  function.  Since  we  must  select  only  one  vertical 
correlation  function,  the  function  we  have  chosen  represents  a  compromise 
between  the  two  correlation  function  estimates,  as  depicted  in  Figure  9. 

The  curves  plotted  in  Figures  8  and  9  are  discrete,  rather  than 
continuous,  curves.  The  particular  form  that  we  have  chosen  for  the 
vertical  correlation  function  is  a  simple  exponential  function  whose 
shape  and  magnitude  were  determined  from  the  actual  data.  It  is 
represented  by 


c?. 


(46) 


where  b  and  d  are  predetermined  constants,  equal  to  1.8  and  3600, 
respectively;  n  and  Zj  are  the  standard  heights  in  meters  at  the  analysis 
pressure  levels  nearest  the  observations  for  300  mb  and  above,  and 
some  specified  value  that  is  less  than  the  standard  height  for  locations 
at  400  mb  and  below.  This  modification  of  the  standard  heights  is 
made  at  lower  levels  to  force  the  vertical  correlation  to  drop  off  more 
rapidly,  thus  better  decoupling  the  boundary  layer  from  the  lower  and 
middle  tropospheric  levels.  As  stated,  the  same  vertical  model  is  used 
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Figure  9.  Comparison  of  the  vertical 
correlation  function  used  in  the 
analysis  (dashed  line)  with  the  adjusted 
height  correlation  function  from 
Figure  8  and  the  estimated  vertical 
wind  component  correlation  function. 
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for  geopotential  height  and  wind  data.  However,  if  one  of  the  variables 
under  consideration  is  a  thickness  observation,  the  vertical  model  is 
modified  accordingly  by  using  the  two  bracketing  heights  (Lorenc,  1981). 
Figure  10  illustrates  the  vertical  correlation  function  computed  in  reference 
to  an  observation  at  500  mb,  using  equation  (46). 

Once  the  horizontal  and  vertical  components  of  the  correlation  function 
have  been  determined,  the  total  prediction  error  correlation  for  any 
variable  at  location  i  with  any  4>,  u,  v,  or  A<}>  data  at  location  j  is  given 
by 


Sij  =  *0  Sjj  ,  (47) 

where  Sfj  is  the  horizontal  correlation  function  appropriate  for  the 
data-pair  type.  It  should  also  be  noted  that  in  equations  (30)-(47), 
the  observation  location  j  could  just  as  easily  be  the  grid  location  k.  In 
other  words,  the  same  expressions  are  used  to  compute  correlations 
between  observation  locations  and  analyzed  grid  point  locations. 

To  maintain  the  geostrophy  of  the  correlation  functions  for  the 
normalized  variables  defined  in  equations  (30)-(41),  Lorenc  (1981) 
pointed  out  that  the  following  relationship  must  be  satisfied 


(48) 


This  constraint  provides  us  with  a  formula  for  computing  the  prediction 
errors  for  the  wind  components  [E^j  from  the  predict'd*  errors  for 


Figure  10.  The  continuous  representation  of  the  vertical 
correlation  function  used  in  the  analysis,  computed 
relative  to  an  observation  at  500  mb  using  b  =  1.8  and 
d  =  3600. 
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height  (££)  that  were  derived  from  the  radiosonde  database.  The  prediction 
errors  for  the  wind  components  plotted  in  Figure  11  were  computed 
from  those  plotted  for  height  using  the  Coriolis  parameter  J,  that  is 
valid  at  45°N.  For  comparison,  the  partitioning  of  the  wind  observation 
and  prediction  errors  was  done  in  the  same  fashion  as  was  done  for  the 
heights,  at  least  in  the  troposphere  and  lower  stratosphere  where 
the  radiosonde  increments  were  available.  At  most  levels,  the  results 
obtained  by  deriving  the  prediction  errors  for  the  wind  components 
from  those  for  height  show  good  agreement  with  what  would  have  been 
obtained  by  performing  the  partitioning  process  for  wind  components. 
At  the  higher  stratospheric  levels,  where  the  analysis  background  is 
more  greatly  influenced  by  the  top  level  of  the  NOGAPS  forecast  model, 
this  agreement  breaks  down.  At  these  upper  analysis  levels  we  have 
chosen  to  use  wind  component  prediction  errors  that  are  consistent  with 
the  height  prediction  errors  and  to  assign  observation  errors  that  are 
typical  of  other  stratospheric  levels. 

To  complete  the  specification  of  the  statistical  parameters  for  the  MVOI, 
we  must  determine  values  for  the  observation  errors  for  the  different 
types  of  observational  data  used  by  the  analysis.  A  detailed  description 
of  the  various  data  types  and  their  global  distribution  is  presented  in  the 
next  section.  We  have  already  shown  how  the  observation  errors  for 
radiosonde  data  are  determined.  The  magnitude  of  the  observation  error 
for  other  types  of  data  is  also  estimated  by  examining  the  statistical 
properties  of  their  differences  from  the  analysis  background  and  taking 
into  account  the  contribution  of  the  prediction  error  to  these  statistical 
estimates.  The  resulting  observation  error  estimates  for  the  different 
data  types  are  summarized  in  Table  2  in  the  next  section.  These  estimates 


Figure  11.  Prediction  error  and  observation  error  for 
radiosonde  wind  data  at  each  analysis  level. 
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3.0  Data  Processing 


3.1  The  Global  Database 

3.1.1  Radiosonde  Data 


are  consistent  with  those  described  by  Shaw  et  al.  (1987)  for  the  ECMWF 
data  assimilation  system.  In  most  cases,  the  observation  errors  are  assumed 
to  be  horizontally  homogeneous,  that  is,  they  do  not  vary  from  one 
x,  y  location  to  another. 


The  historical  evolution  of  the  global  database  utilized  by  the  world’s 
meteorological  forecast  centers  is  described  by  Dey  (1989).  With  only 
a  few  exceptions,  ail  operational  forecast  centers  have  access  to  the 
same  observational  data.  Atmospheric  measurements  are  made  by 
radiosondes,  pilot  balloons,  aircraft,  ships,  buoys,  and  ground  stations. 
Satellite-derived  measurements  are  made  using  polar-orbiting  and 
geostationary  satellites.  Synthetic  observations  are  created  to  provide 
information  in  data-sparse  areas.  In  this  section  we  will  describe  the 
different  types  of  observations  used  by  the  Navy’s  atmospheric  analysis 
systems,  their  distribution  in  space  and  time,  and  the  types  of  processing 
required  to  prepare  the  data  for  the  MVOI  analysis. 

The  observations  reside  in  FNOC  data  files  using  packed  formats 
that  are  described  in  the  FNOC  Computer  User’s  Guide  (1987).  Prior 
to  the  analysis,  data  preprocessing  software  accesses  these  files  and 
rewrites  the  observations  in  the  FGGE*  format.  This  standardization  of 
format  permits  the  analysis  software  to  be  easily  executed  using  either 
operational  or  research  data  sets. 

The  data  preprocessing  that  takes  place  prior  to  the  analysis  performs 
several  different  quality  tests  on  the  data  (Baker,  1991),  including  checks 
for  location,  timeliness,  and  vertical  consistency.  As  a  result  of  these 
checks,  each  piece  of  information  is  associated  with  a  flag  that  indicates 
whether  that  observation  should  be  rejected,  subjected  to  further  checking, 
or  accepted  as  is.  In  some  cases,  the  flag  value  will  indicate  that  a 
particular  correction  was  made  or  that  the  observation  was  not  checked 
at  all.  These  quality  flags  are  written  with  the  observations  in  the  FGGE 
format,  and  are  used  by  the  analysis  to  make  additional  quality  control 
decisions. 

Within  the  analysis,  each  data  type  undergoes  further  processing. 
The  FGGE-formatted  data  are  read  and,  if  necessary,  the  reported 
observations  are  converted  to  observations  of  geopotential  height,  wind, 
or  geopotential  thickness.  The  observed  increments  are  computed  by 
subtracting  the  background  values  from  the  observed  values.  Some  of 
the  high-resolution  observations  are  thinned  and  averaged  before  they 
are  used  in  the  analysis.  Certain  known  errors  are  corrected,  and  the 
preliminary  quality  flags  are  used  to  further  assess  and  assign  analysis 
quality  flags  to  the  observed  increments.  While  all  data  types  have 
some  processing  steps  in  common,  they  also  have  their  own  unique  set 
of  requirements,  so  each  data  type  will  be  discussed  in  more  detail. 

The  radiosonde  is  still  the  backbone  of  the  observational  network, 
providing  twice-daily  measurements  of  atmospheric  temperature,  moisture, 
and  (indirectly)  wind  from  the  surface  to  the  upper  stratosphere  at  fixed 


♦First  GARP  Global  Experiment  (GARP  was  the  Global  Atmospheric  Research  Program). 
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sites  around  the  globe.  From  these  measurements,  observations 
of  geopotential  height  are  computed  at  the  standard  pressure  levels,  and 
observations  of  temperature,  dew-point  depression,  wind  speed  and  wind 
direction  are  determined  at  both  standard  and  significant  levels.  Typically, 
ove  700  radiosonde  stations  report  each  day  for  0000  UTC  and  1200  UTC, 
while  about  50  soundings  are  received  for  0600  UTC  and  1800  UTC.  The 
radiosonde  network  is  shown  in  Figure  12,  While  the  northern  midlatitudes 
are  reasonably  well  covered  by  these  observation  sites,  there  are  relatively 
few  stations  scattered  over  the  rest  of  the  world.  For  example, 
approximately  90%  of  the  radiosonde  sites  are  in  the  Northern 
Hemisphere,  and  only  a  little  over  10%  of  the  stations  are  located  in  the 
tropics.  The  vertical  distribution  of  radiosonde  observations  is  also 
non-uniform.  While  tropospheric  coverage  is  quite  good,  with  80%  of 
the  soundings  normally  reaching  100  mb,  coverage  of  the  stratosphere 
is  less  reliable.  Only  about  40%  of  the  reported  soundings  reach  20  mb 
and  as  few  as  10-15%  reach  10  mb. 

The  total  number  of  reports  that  may  be  read  into  the  analysis  is 
1000,  which  is  sufficient  to  include  the  700  or  so  World  Meteorological 
Organization  (WMO)  stations,  plus  ship  soundings  and  synthetic 
observations.  As  these  reports  are  read,  the  reported  latitudes  and 
longitudes  are  checked  to  remove  any  duplicates.  Only  mandatory-level 
height  and  wind  data  are  retained,  with  one  exception.  Since  one  of  the 
analysis  levels  (925  mb)  is  not  a  mandatory  level,  a  925-mb  observation 
is  created  from  two  bracketing  significant-level  observations  reported 
between  1000  mb  and  850  mb.  If  only  one  such  significant  level  is 
available,  either  the  1000  mb  or  850  mb  report  may  be  used  for  the 
other  level.  If  there  are  no  significant-level  observations  in  that  range, 
the  925-mb  observation  is  set  to  missing,  since  there  is  no  new  information 
gained  by  interpolating  between  two  mandatory-level  observations.  The 
interpolation  is  linear  with  respect  to  pressure,  and  heights  are  processed 
independently  of  the  wind  speed  and  direction. 

A  radiation  correction  is  applied  to  the  height  reports  obtained  from 
the  U.S.  and  the  Canadian  radiosondes.  The  appropriate  corrections  for 
each  country  were  determined  from  the  radiosonde  increment  database, 
described  in  Section  2.2,  and  from  the  results  of  the  international 
comparison  of  radiosonde  instruments  conducted  by  the  WMO  at  Wallops 
Island  (Nash  and  Schmidlin,  1987).  The  magnitude  of  the  corrections  is 
a  function  of  sun  angle.  The  sun  angle  is  determined  from  the  station 
latitude  and  longitude;  the  month,  day,  and  time  of  day  of  the  sounding; 
and  the  balloon  ascent  rate. 

Other  types  of  corrections  are  also  made  to  certain  radiosonde 
observations.  For  example,  the  height  reports  from  China  are  corrected 
for  observed  systematic  biases.  The  magnitude  of  this  bias  is  determined 
from  the  radiosonde  increment  database,  and  the  correction  made  always 
increases  the  reported  height  at  each  pressure  level.  This  correction  is 
imposed  on  all  stations  using  block  numbers  beginning  with  50-59. 
Finally,  while  we  do  not  correct  for  the  Indian  height  reports  at  this 
time,  we  do  flag  all  height  reports  from  blocks  42  to  43  as  suspect. 

After  the  height  corrections  are  made,  all  height  observations  are 
converted  to  geopotential  height  and  the  u-  and  rewind  components 
are  computed  from  the  wind  speed  and  direction.  For  all  wind  data, 
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Figure  12.  Global  distribution  of  radiosonde  observations. 


if  the  reported  speed  is  greater  than  some  predefined  limit,  the  wind 
components  are  flagged  for  rejection.  The  upper  limit  is  a  function  of 
level,  and  varies  from  60  m/sec  at  1000  mb  to  160  m/sec  at  400  mb  and 
higher  levels.  Using  these  test  results  and  the  quality  flags  assigned  to 
the  variables  by  the  preliminary  checks,  the  analysis  quality  flag  is  set 
to  indicate  which  data  have  been  rejected.  Otherwise,  the  analysis  qual¬ 
ity  flag  will  reflect  the  preliminary  flag  values  assigned  to  the  height 
and  wind  observations. 

The  analysis  quality  flag  consists  of  a  two-digit  number  and  is  illustrated 
in  Table  1.  In  general,  the  10’s  digit  indicates  which  variables  are 
available,  and  the  l’s  digit  indicates  which  variables  should  be  checked 
further  by  the  analysis.  If  height  and  wind  data  are  available  at  a  particular 
x,  y,  z  location,  the  10’s  digit  is  set  to  -1.  If  only  the  height  report  is 
available,  the  10’s  digit  is  0,  while  wind-only  observations  are  represented 
by  a  10’s  digit  equal  to  1.  If  a  particular  variable  was  flagged  for 
rejection  by  the  preceding  quality  control  checks,  that  variable  is 
considered  to  be  unavailable  when  the  value  of  the  analysis  quality 
flag  is  determined.  If  the  entire  observation  (all  variables  at  a  given 
location)  is  to  be  rejected  outright,  the  analysis  quality  flag  is  set  to  99. 

If  all  the  variables  in  the  observation  r  accepted  by  the  preliminary 
quality  tests,  the  one’s  digit  of  thi  analysis  quality  flag  is  set  to  0.  Data 
requiring  further  checking  by  the  analysis  are  indicated  by  setting  the 
l’s  digit  equal  to  1  for  the  height  observation,  2  for  the  wind  components, 
or  3  for  all  the  variables.  The  same  convention  for  the  analysis  quality 
flag  is  followed  for  all  data  sources. 

3.1.2  Pilot  Balloon  Data  Pilot  balloons  (pibals)  also  provide  vertical  soundings  of  wind  speed  1 

and  direction,  but  do  not  generally  provide  the  depth  of  information 
available  from  the  radiosondes.  Only  about  one-third  of  the  pibal  soundings 
reach  as  high  as  100  mb.  Furthermore,  on  a  global  scale,  their  numbers 
are  relatively  small,  with  150-200  pressure-level  soundings  available  at 
each  analysis  time.  However,  their  spatial  distribution  is  evenly  divided  ( 


Table  1.  MVOI  analysis  quality  flag. 


Data  Available* 

Flag  Value 

Accept 

Check  Further 

Reject 

4>  only 

0 

— 

— 

1 

— 

* 

— 

99 

— 

— 

u,  vonly 

10 

U,  V 

— 

— 

12 

— 

u,v 

— 

99 

— 

— 

u,v 

-10 

4 ),u,v 

— 

— 

-11 

U'V 

<t> 

— 

-12 

* 

u,v 

— 

-13 
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— 

99 

— 
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U,V 

'Some  values  rejected  earlier  may  be  considered  unavailable. 
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between  the  hemispheres,  with  almost  half  of  the  sounding  sites  located 
in  the  tropics  (Fig.  13).  In  this  respect,  they  are  an  important  supplement 
to  the  radiosonde  network. 

Storage  space  is  available  in  the  analysis  for  up  to  650  pibals,  which 
is  more  than  sufficient  to  handle  all  the  available  soundings.  Pibals  are 
wind-only  soundings  that  are  reported  at  either  fixed  pressures  or  at 
fixed  heights,  but  not  both.  Only  the  mandatory  pressure-level  observations 
are  saved  for  the  analysis.  If  the  sounding  is  reported  at  fixed  heights, 
the  600-m  and  900-m  wind  speeds  and  directions  are  vertically  interpolated 
to  create  a  925-mb  wind  observation;  no  other  wind  observations 
are  used  from  the  height-level  soundings.  In  all  cases,  wind  speeds 
greater  than  some  predefined  limit  are  ignored.  That  limit  ranges  from 
60  m/sec  at  1000  mb  to  160  m/sec  at  300  mb  and  above.  The  observations 
exceeding  these  limits  are  flagged  for  rejection.  The  analysis  quality 
flags  for  the  other  observations  simply  reflect  the  decisions  made  by 
the  preliminary  quality  checks. 

3.1.3  Surface  Data  Conventional  surface  observations  are  those  taken  by  land  stations, 

fixed  and  drifting  buoys,  coastal  marine  stations,  and  ships.  Since  surface 
observations  are  reported  more  frequently  than  upper-air  observations, 
the  number  of  reports  is  more  constant  with  time,  with  around  7000 
conventional  surface  observations  available  at  each  of  the  primary  analysis 
times  (0000  UTC,  0600  UTC,  1200  UTC,  and  1800  UTC).  Approximately 
90%  of  these  surface  reports  are  from  land  stations.  Although  10  times 
greater  in  number  than  the  radiosonde  stations,  the  spatial  distribution 
of  the  surface  stations  is  similarly  skewed  toward  the  Northern  Hemisphere 
(Fig.  14).  This  is  true  not  only  of  the  land  stations,  but  the  marine  reports 
as  well.  For  example,  of  the  700  or  so  ship  and  buoy  observations 
received  each  analysis  time,  nearly  90%  are  located  in  the  Northern 
Hemisphere.  The  variables  normally  reported  by  the  surface  platforms 
include  sea-level  pressure,  temperature,  moisture,  wind  speed  and  wind 
direction.  A  very  small  percentage  of  land  stations  report  station  pressure  or 
the  height  of  a  standard  pressure  surface  rather  than  sea-level  pressure. 

As  the  data  are  read,  duplicate  reports  are  identified  by  comparing 
latitudes,  longitudes  and  block/station  names.  Reports  at  the  same  location 
or  with  the  same  identifier  are  removed  (except  for  observations  with 
generic  station  names,  such  as  SHIP,  BUOY,  etc.).  Thus,  only  one 
observation  per  platform  is  used  in  the  analysis,  even  if  that  platform 
reports  every  hour.  Since  the  data  records  for  the  analysis  time  are  read 
first,  followed  by  observation  records  at  ±1,  ±2,  then  ±3  hours  from  the 
analysis  time,  the  most  timely  observation  from  each  platform  is  saved. 
We  allow  up  to  8000  surface  reports  to  be  stored,  which  is  sufficient 
to  handle  the  typical  number  of  reports  (around  7000). 

The  treatment  of  the  wind  data  is  fairly  straightforward.  First,  surface 
wind  observations  over  land  are  not  used  in  the  analysis.  Such  observations 
are  too  often  unrepresentative  of  the  wind  in  the  free  atmosphere  because 
of  the  effects  of  terrain  or  shallow  radiation  inversions.  Marine  winds 
flagged  for  rejection  by  the  preliminary  quality  checks  are  assigned 
an  analysis  quality  flag  indicating  rejection,  and  the  wind  components 
are  set  to  missing.  Wind  reports  with  speeds  of  greater  than 
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50  m/sec  are  similarly  rejected.  Otherwise,  as  long  as  the  wind  direction  is 
valid  (between  0°  and  360°),  the  w-  and  v-wind  components  are  computed 
from  the  reported  speed  and  direction. 

Specification  of  a  geopotential  height  observation  from  the  surface 
data  is  more  complex,  since  the  various  types  of  surface  observations 
are  reported  in  different  ways.  Generally,  land  stations  and  marine 
platforms  report  sea-level  pressure.  A  1000-mb  geopotential  height 
observation  is  computed  from  the  sea-level  pressure  using  the  hypsometric 
equation,  where  the  layer  temperature  is  assumed  to  be  the  reported 
surface  temperature.  The  reported  pressure  must  be  between  920  mb 
and  1080  mb  and  the  surface  temperature  between  -80°C  and  50°C,  or 
the  data  is  flagged  for  rejection.  An  added  complication  arises,  since 
FNOC  stores  only  the  last  two  digits  of  the  surface  pressure.  Thus,  50 
could  represent  either  950  mb  or  1050  mb,  with  no  indication  of  which 
is  correct.  Thus,  the  alternative  pressure  is  carried  along  with 
each  observation  and  is  also  converted  to  a  1000-mb  height.  After  the 
observations  are  differenced  with  the  background,  a  decision  is  made  as 
to  which  of  the  alternative  values  is  the  correct  one. 

Over  land,  some  stations  report  station  pressure  and  elevation. 
Observations  reporting  elevations  of  less  than  -400  m  or  greater  than 
4000  m  above  sea  level  are  flagged  for  rejection.  Otherwise,  the  elevation 
is  used  to  assign  the  observation  to  the  nearest  analysis  level.  To  calculate 
geopotential  height  at  the  reported  elevation,  we  first  estimate  the  standard 
atmosphere  pressure  at  the  station’s  elevation  (following  Haltiner  and 
Martin,  1957)  and  compare  it  to  the  reported  pressure.  This  comparison 
resolves  discrepancies  that  may  result  by  assigning  the  wrong  100’s 
digit  to  the  two-digit  pressure  report.  If  the  station  elevation  is  less  than 
437  m,  the  report  is  assigned  to  the  1000-mb  analysis  level.  In  this 
case,  if  the  absolute  difference  from  the  standard  atmosphere  is  more 
than  50  mb,  the  reported  pressure  is  adjusted  by  100  mb  in  the  appropriate 
direction.  At  elevations  of  437  m  and  higher,  if  the  pressure  difference 
is  more  than  100  mb,  the  reported  pressure  is  incrementally  reduced  by 
100  mb  until  the  difference  from  the  standard  pressure  at  that  elevation 
is  less  than  100  mb.  If  the  remaining  difference  is  still  more  than  50  mb, 
one  final  100-mb  adjustment  is  made  to  the  pressure,  with  the  direction 
of  the  adjustment  determined  by  the  sign  of  the  difference.  Finally, 
the  height  of  the  standard  atmosphere  is  determined  hydrostatically 
from  the  standard  pressure  at  the  station’s  elevation.  The  difference 
between  the  reported  elevation  and  the  computed  standard  height  is  added 
to  the  standard  height  for  the  assigned  analysis  level,  thereby  extrapolating 
the  estimated  increment  at  the  reported  elevation  to  create  an  observa¬ 
tion  of  geopotential  height  at  the  assigned  analysis  level.  The  observation 
retains  the  preliminary  quality  flag  of  the  reported  station  pressure,  and 
the  analysis  quality  flag  is  determined  accordingly. 

Finally,  rather  than  reporting  station  pressure  or  estimating 
the  equivalent  sea-level  pressure,  some  of  the  stations  at  higher 
elevations  report  geopotentia)  height  at  the  nearest  mandatory  pressure 
level — either  850  mb,  700  mb,  or  500  mb.  In  these  cases,  since  these 
pressure  levels  are  also  analysis  levels,  no  further  modification  to  the 
reported  data  is  necessary. 
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3.1.4  Aircraft  Data 


Aircraft  reports  include  both  pilot  reports  and  automated  reports. 
Regardless  of  their  source,  all  aircraft  reports  contain  single-level  wind 
speed  and  direction,  with  the  location  of  each  report  identified  by  latitude, 
longitude,  and  height  above  sea  level.  As  can  be  seen  from  Figure  15, 
most  of  these  reports  are  concentrated  along  the  transoceanic  and  U.S. 
transcontinental  flight  paths.  Although  most  of  these  observations  are 
taken  at  cruising  altitudes,  typically  from  300  mb  to  200  mb,  occasional 
reports  are  received  at  lower  or  higher  levels. 

In  practice,  reports  that  are  less  than  500  m  above  sea  level  are 
ignored,  as  are  reports  of  wind  speeds  greater  than  125  m/sec.  Otherwise, 
the  u-  and  y-wind  components  are  computed  from  the  wind  speed  and 
direction,  and  the  preliminary  quality  flags  are  used  to  set  the  analysis 
quality  flag  for  each  wind  report  pair.  A  maximum  of  2000  «,  v  pairs 
can  be  stored,  with  approximately  1000  observation  pairs  available  for 
each  analysis  time.  Since  a  number  of  aircraft  follow  the  same  flight  paths 
and  tend  to  report  at  certain  fixed  latitude/longitude  crossings,  the  data 
are  thinned  by  computing  time-weighted  averages  from  the 
co-located  aircraft  wind  observations.  This  reduces  the  number  available 
at  any  one  analysis  time  to  around  800. 

After  the  wind  components  have  been  computed  and  stored,  but  before 
the  wind  increments  are  computed,  the  observations  are  sorted — first  on 
latitude,  then  longitude,  followed  by  height  and  time.  This  order  ensures 
that  duplicate  observations  will  be  stored  adjacent  to  one  another  in  the 
data  arrays  and  can  be  easily  compared.  Duplicate  reports,  that  is, 
observations  reporting  identical  wind  information  at  the  same  time  and 
same  jc,  y,  z  location,  are  effectively  removed  from  the  data  set  by 
flagging  all  but  one  of  them  as  rejected  reports.  If  two  or  more  observations 
are  reported  at  the  same  time  and  same  location,  but  the  reported  wind 
speeds  and  directions  are  different,  the  entire  co-located  group  of 
observations  is  flagged  for  rejection.  However,  if  multiple  reports  at  the 
same  location  have  the  same  winds  but  are  reported  at  different  times, 
the  most  recent  report  is  assumed  to  be  valid  and  the  others  are  flagged 
for  rejection.  It  is  not  unusual  to  remove  over  150  observations  during 
the  duplicate  check. 

To  reduce  the  redundancy  of  reports  made  by  different  aircraft  at 
different  times,  but  along  the  same  flight  path,  co-located  observations 
are  combined  into  one  s uperob  by  performing  a  time-weighted  average 
of  the  u-  and  y-wind  components  individually.  The  weighting  function 
is  of  the  form 


where  t  ranges  from  -3  to  +3  hours  from  the  analysis  time.  The 
resulting  wind  components  are  computed  as  linear  combinations  of 
the  original  observations  and  their  respective  normalized  weights.  The 
original  observations  are  then  flagged  for  rejection  and  replaced  with 
the  newly  computed  superobs  of  the  u-  and  y-wind  components.  Normally, 
150-200  superob  pairs  are  created,  with  the  majority  resulting  from  the 
combination  of  two  or  three  individual  reports.  However,  we  have  seen 
superobs  computed  from  up  to  eight  co-located  aircraft  observations. 
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3.1.5  Satellite  Cloud-Tracked 
Wind  Data 


3.1.6  Satellite  Sounding  Data 


Other  single-level  wind  observations  are  available  from  U.S.,  European, 
and  Japanese  geostationary  satellites.  These  data  are  referred  to  as  cloud- 
tracked  winds,  indicative  of  the  method  used  to  derive  them.  Because 
of  the  fixed  position  of  the  satellites,  it  is  only  possible  to  accurately 
produce  cloud-tracked  winds  in  the  tropics  and  midiatitudes,  between 
about  50°N  and  50°S.  While  the  number  of  cloud-tracked  winds  is 
more  variable  than  other  types  of  single-level  observations, 
1000-2000  are  usually  available  for  the  0000  UTC,  0600  UTC,  and 
1200  UTC  analysis  times.  The  distribution  of  these  observations 
is  obviously  dependent  upon  cloud  coverage,  but  they  are  fairly  evenly 
distributed  between  the  hemispheres,  with  roughly  half  of  the  observations 
located  in  the  tropics  (Fig.  16).  The  cloud-tracked  winds  tend  to  be 
vertically  clustered,  with  around  50%  reported  below  775  mb  and  about 
one-third  of  the  reports  concentrated  at  jet  levels  (300-200  mb).  The 
remainder  of  the  reports  are  scattered  at  various  levels,  up  to  approximately 
100  mb. 

Up  to  6000  cloud-tracked  wind  observations,  valid  within  3  hours  of 
the  analysis  time,  can  be  stored  for  the  analysis,  with  typical  numbers 
around  1500.  Observations  with  a  reported  pressure  greater  than  1000  mb 
or  less  than  10  mb  are  excluded.  Reported  wind  speeds  of  greater  than 
125  m/sec  are  disregarded.  Observations  flagged  as  erroneous  by  the 
preliminary  quality-control  checks  are  assigned  an  analysis  quality  control 
flag  of  99,  for  rejection.  Remaining  observations  are  checked  for  appro¬ 
priate  ranges  of  values  for  wind  speed,  direction,  and  location,  with  any 
out-of-range  value  causing  an  observation  to  be  flagged  for  rejection. 

Satellites  also  provide  multilevel  observations  in  the  form  of  temperature 
soundings  that  are  derived  from  radiance  measurements  made  by  the 
polar  orbiters.  FNOC  has  access  to  data  from  four  such  satellites, 
two  flown  by  the  National  Oceanic  and  Atmospheric  Administration 
(NOAA-10  and  -11)  and  two  from  the  Defense  Meteorological  Satellite 
Program  (DMSP  F8  and  F9).  However,  if  desired,  data  from  any  individual 
satellite  can  be  excluded,  without  software  changes,  simply  by  specifying 
the  proper  input  parameters  in  the  job  stream  that  executes  the  analysis. 
Data  from  the  remaining  satellites  will  not  be  affected.  With  all  four 
satellites  available,  global  coverage  is  provided  every  6  hours.  Figure  17 
shows  the  typical  coverage  available  for  an  analysis  time.  With  few 
exceptions,  each  satellite  sounding  extends  from  the  surface  to  at  least 
10  mb,  with  the  NOAA  satellite  soundings  extending  to  0.4  mb.  In 
addition,  the  NOAA  satellites  provide  measurements  of  precipitable 
water  from  the  surface  to  300  mb. 

Up  to  4360  soundings  can  be  stored  for  use  in  the  analysis.  Since  this 
is  less  than  the  number  of  available  soundings  over  a  6-hour  analysis 
window,  every  fourth  sounding  is  skipped  as  the  data  are  read,  thereby 
reducing  the  number  of  profiles  while  retaining  good  horizontal  coverage. 
Each  sounding,  as  it  is  passed  to  the  analysis,  contains  multiple  reports 
of  layer  thickness  in  geopotential  meters.  Each  thickness  report  is 
accompanied  by  a  quality  flag  and  by  the  pressures  at  the  top  and 
bottom  of  the  layer.  If  the  layer  so  defined  coincides  with  a  layer 
defined  by  the  analysis  pressure  levels,  the  report  is  saved  for  the  analysis. 
The  only  test  performed  on  the  data  (at  this  stage)  assures  that  the 
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Figure  17.  Typical  6-hour  coverage  provided  by  the  NOAA  (X)  and  DMSP  (O)  satellite  temperature  soundings. 


reported  thickness  is  positive,  The  analysis  quality  flag  is  set  to  0  for 
all  available  thickness  datu;  missing  or  erroneous  layers  are  assigned  a 
quality  flag  of  99.  Lapse-rate  checks  will  also  be  performed  on  the 
satellite  soundings;  these  tests  are  described  in  Section  3.3. 

Since  one  of  the  analysts  levels,  92S  mb,  is  not  a  mandatory  pressure 
level,  we  compute  a  1000-925-mb  thickness  and  a  925-850-mb  thick¬ 
ness  in  a  manner  consistent  with  the  atmospheric  structure  reported  in 
the  lower  layers.  The  reported  850-700-mb  thickness  and  the 
1000-850-mb  thickness  arc  used  in  this  calculation.  If  the  lowest  layer 
reported  is  not  bounded  by  1000  mb,  but  the  bottom  pressure  is  greater 
than  925  mb,  the  calculation  is  still  performed,  but  only  the 
925-850-mb  layer  thickness  is  computed.  First,  from  the  hypsometric 
equation,  we  compute  the  mean  temperatures  of  the  two  reported  layers, 
which  in  turn  are  appropriately  weighted  to  estimate  the  mean  temperature 
of  925-850-mb  layer.  Again  using  the  hypsometric  equation,  the 
925-850-mb  thickness  is  computed  using  the  estimated  mean  temperature 
of  the  layer.  The  1000-925-inb  thickness  is  then  computed  by  subtracting 
the  925-850-mb  thickness  from  the  1000-850-mb  thickness,  when 
available 

Normally,  the  satellite  sounding  data  are  read  from  the  FNOC  SOF 
ream's,  which  contain  the  data  received  at  FNOC  from  Carswell  Air 
Forre  Base.  These  records  include  layer  thicknesses  up  to  1  mb,  where 
available.  However,  FNOC  also  receives  sounding  data  directly  from 
the  National  Environmental  Satellite  Data  and  Information  Service  in 
Washington,  D.C.  These  data  are  stored  in  the  SOG  and  SOH  records, 
and  are  in  the  form  of  layer  mean  temperatures  up  to  50  mb  (SOG)  and 
from  50  to  0.4  mb  (SOH).  If  no  sounding  data  are  found  on  the  SOF 
records,  the  SOG  and  SOH  data  are  used  as  alternatives. 

The  SOG  records  are  processed  in  much  the  same  way  as  the  SOF 
records,  although  the  thickness  must  be  computed  from  the  reported 
layer  mean  temperature  and  the  pressures  bracketing  the  layer.  Mean 
temperatures  from  the  two  lowest  layers  are  used  to  compute  the 
1000-925-mb  and  925-850-mb  thicknesses,  as  described  above.  Other 
layers  reported  in  the  SOG  records  are  also  incompatible  with  the  layers 
defined  by  the  analysis  levels.  Therefore,  the  available  300-200-mb 
thickness  is  decomposed  into  300-250-mb  and  250-200-mb  thick¬ 
nesses,  and  the  reported  200-100-mb  thickness  is  used  to  compute 
200-150-mb  and  150-100-mb  thicknesses.  The  mean  temperature  in  the 
layer  being  subdivided  is  considered  along  with  the  mean  temperature 
of  the  layer  below  to  estimate  the  mean  temperature  of  the  lower  of  the 
two  new  layers.  These  computations  mimic  those  used  for  dividing 
the  1000-850-mb  layer,  and  in  each  case,  the  two  new  layer  thicknesses 
will  add  up  exactly  to  the  original  layer  thickness. 

When  the  SOF  records  are  missing,  the  SOH  records  provide  data  for 
the  upper  layers  of  the  atmosphere.  The  data  are  processed  in  much  the 
same  manner  as  already  described — reported  mean  layer  temperatures 
are  used  to  compute  layer  thickness  in  geopotential  meters.  Only  one 
mismatch  occurs  between  the  reported  layers  and  the  analysis  layers. 
Thus,  the  reported  30-10-mb  mean  layer  temperature  is  used  along 
with  the  50-30-mb  mean  layer  temperature  to  compute  the  30-20-mb 
and20-10-mb  thicknesses.  Finally,  the  thicknesses  above  50  mb  from 
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the  SOH  records  must  be  merged  with  the  soundings  from  the  SOG 
records.  Because  of  the  reduced  number  of  levels  reported,  there  may 
be  twice  as  many  SOH  soundings  stored  as  there  are  SOG.  The  location 
of  each  upper-level  sounding  is  compared  to  the  location  of  each  lower 
level  sounding,  and  if  the  latitude  and  longitude  differences  are  no 
more  than  0.5°  each,  the  data  are  merged  to  provide  a  complete  profile 
of  geopotential  thickness  for  all  analysis  layers,  with  associated  quality 
flags,  thus  mimicking  exactly  the  format  of  the  SOF  soundings. 

3.1.7  SSM/I  Wind  Speed  Data  Estimates  of  the  surface  wind  speed  over  the  oceans  can  be  derived 

from  measurements  made  by  the  Special  Sensor  Microwave/Imager 
(SSM/I),  a  passive  microwave  radiometer  flown  on  board  the  DMSP  F8 
satellite.  These  observations  are  derived  by  taking  advantage  of  the 
dependency  of  emissivity  on  the  state  of  the  sea  surface  (Goodberlet 
et  al.,  1990).  Currently,  FNOC  is  the  only  forecast  center  with  operational 
access  to  the  SSM/I  data.  Within  a  6-hour  period,  over  200,000  wind 
speed  observations  are  derived  from  this  sensor  at  approximately 
20-km  resolution.  Since  this  data  density  is  far  greater  than  that  required 
by  NOGAPS,  the  high-resolution  SSM/I  observations  are  thinned  and 
averaged,  resulting  in  about  2000  observations  at  200-km  resolution. 
Figure  18  illustrates  the  distribution  of  the  thinned  SSM/I  data  available 
for  use  by  a  1200  UTC  analysis.  Relatively  good  global  coverage  is 
available  over  a  24-hour  period. 

The  high-density  SSM/I  observations  are  thinned  during  preprocessing 
so  that  only  every  sixth  observation  is  presented  to  the  analysis.  However, 
even  this  data  density  is  still  much  greater  than  that  required  for  the 
scales  resolved  by  a  global  analysis/forecast  system.  Initially,  the  analysis 
may  read-in  up  to  32,000  individual  wind  speed  observations,  starting 
with  the  current  analysis  time  and  working  in  ±l-hour  increments  over 
the  6-hour  data  window  of  the  analysis.  Since  the  number  of  original 
observations  within  that  time  window  has  already  been  reduced  by  the 
preprocessor,  sufficient  storage  is  available  for  the  remaining  observations. 
As  the  analysis  processes  this  data,  however,  further  information 
consolidation  is  desirable. 

First,  the  individual  wind  speed  reports  are  subjected  to  tests  that 
compare  the  observed  values  to  the  wind  speeds  produced  by 
the  preliminary  1000-mb  wind  analysis  described  in  Section  5.2.  The 
preliminary  field  is  horizontally  interpolated  to  the  location  of  each 
observation  and  the  wind  speed  increments  are  computed  as  the  observed 
minus  the  analyzed  differences.  If  the  analyzed  speed  is  less  than 
4  m/sec,  but  the  observed  wind  speed  is  more  than  3  m/sec  greater  than 
the  preliminary  estimate,  the  SSM/I  observation  is  not  saved.  Since  the 
wind  direction  for  the  SSM/I  data  will  be  assigned  from  this  preliminary 
analysis  field,  we  do  not  want  to  use  a  wind  direction  from  a  suspected 
light  and  variable  wind  condition  to  assign  a  direction  to  a  stronger 
observed  wind  speed.  If  the  analyzed  winds  are  calm,  and  thus 
have  no  direction,  we  must  also  reject  the  SSM/I  observation.  In 
situations  where  the  analyzed  wind  speed  is  between  4  and  10  m/sec, 
SSM/I  data  are  rejected  only  if  the  absolute  value  of  the  increment 
exceeds  7.5  m/sec.  For  stronger  wind  conditions,  defined  as 
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Figure  18.  Typical  6-hour  coverage  provided  by  the  averaged  SSM/I  wind  speed  observations. 


analyzed  speeds  in  excess  of  10  m/sec,  an  absolute  difference  of  up  to 
10  m/sec  is  allowed. 

The  wind  speed  increments  that  pass  these  quality  control  checks  are 
binned  and  averaged  to  form  SSM/I  s uperob  increments.  The  increments, 
rather  than  the  full  field  values,  are  used  to  compute  the  superobs  because 
they  are  more  horizontally  homogeneous.  Each  superob  is  assigned  a 
latitude  and  a  longitude  from  the  center  location  of  each  bin.  The  bin 
locations  are  fixed,  with  each  bin  encompassing  2°  of  latitude  and  varying 
degrees  of  longitude,  in  such  a  way  that  the  bins  extend  over  roughly 
the  same  physical  distance  in  both  the  north-south  and  east-west  directions. 
This  layout  results  in  SSM/I  superobs  with  a  data  density  of  approximately 
200  km.  One  last  caveat  is  that  there  must  be  at  least  three  independent 
observations  in  a  bin  to  form  a  superob.  If  there  are  only  one  or  two 
increments  in  a  bin,  they  will  be  ignored. 

The  final  step  is  to  assign  a  wind  direction  to  each  superob  wind 
speed  increment.  The  u-  and  rewind  components  from  the  preliminary 
1000-mb  analysis  field  are  horizontally  interpolated  to  the  superob 
locations.  From  these  components,  the  analyzed  wind  speed  is  computed  at 
each  location  and  added  to  the  superob  increment  to  create  a  full-field 
wind  speed  superob.  Then,  the  u-  and  rewind  components  of  the  superob 
are  computed  by  multiplying  the  wind  components  of  the  preliminary 
analysis  by  the  ratio  of  this  superob  wind  speed  to  the  preliminary  analysis 
wind  speed.  From  this  point  on,  the  SSM/I  superob  components  are 
treated  like  any  other  full-field  u-  and  rewind  observations.  Any  prior 
processing  using  the  preliminary  1000-mb  analysis  is  ignored.  The  analysis 
quality  flag  associated  with  each  superob  wind  pair  is  set  to  10,  for 
now. 

3.1.8  Synthetic  Data  Finally,  the  global  database  is  supplemented  by  the  creation  of  synthetic 

observations  for  specific  applications.  The  use  of  synthetic 
surface  observations  is  common  practice  at  the  global  weather  forecast 
centers.  The  most  commonly  used  synthetic  observations  are  the 
subjectively  derived  estimates  of  sea-level  pressure  in  the  Southern 
Hemisphere  (PAOBS) — observations  produced  by  the  Australian  Bureau 
of  Meteorology  (Guymer,  1978)  and  provided  to  the  other  operational 
centers.  These  observations  are  available  at  FNOC  for  the  0000  UTC 
and  1200  UTC  analyses,  and  their  typical  distribution  is  shown  in 
Figure  19.  The  PAOBS  are  checked  during  the  preliminary  quality  control, 
and  are  subsequently  processed  with  the  other  surface  data  as  described. 

Locally,  FNOC  personnel  generate  synthetic  observations  of  sea-level 
pressure  and  surface  winds  in  the  vicinity  of  oceanic  extratropical  cyclones. 
Typically,  such  observations  are  created  in  relatively  data-sparse  areas 
like  the  North  Pacific,  in  situations  where  satellite  imagery  indicates 
that  the  cyclone  is  deeper  than  its  depiction  by  NOGAPS.  This  type  of 
synthetic  observation  is  routinely  used  in  both  the  0000  UTC  and 
1200  UTC  analyses  (Goerss,  1989).  The  synthetic  observations  are 
bypassed  by  the  preliminary  checks.  However,  these  data  are  stored  and 
processed  with  the  other  surface  reports  and  will  therefore  be  subjected 
to  the  analysis  quality  checks. 

In  June  1990,  FNOC  also  began  the  assimilation  of  synthetic  wind 
soundings  in  the  vicinity  of  tropical  storms.  These  soundings  are 
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3.2  Observation  Increments 


3.3  Analysis  Gross-Error 
Checks 


automatically  generated  from  the  warnings  issued  by  the  Joint 
Typhoon  Warning  Center  (JTWC),  Guam,  and  the  National  Hurricane 
Center  (NHC),  Miami,  for  tropical  cyclones  whose  maximum  wind  speeds 
are  35  kt  or  greater.  The  synthetic  soundings  are  centered  on  the  storm’s 
location  and  reflect  the  maximum  wind  speed  and  radius  of  maximum 
wind  information  contained  in  the  warning  message.  The  preliminary 
quality  checks  are  bypassed,  and  the  synthetic  soundings  are  subsequently 
processed  by  the  analysis  along  with  the  radiosonde  observations. 


The  MVOI  is  an  incremental  analysis;  that  is,  it  uses  information 
from  current  observations  to  update  the  background  fields  p.ovided  by 
the  forecast  model.  The  current  observations  generally  include  all  data 
within  ±3  hours  of  the  analysis  time,  although  the  time  window  may 
vary  depending  upon  the  particular  application  of  the  analysis 
(Section  5).  Once  the  data  within  this  time  window  have  been  processed 
and  converted  to  observations  of  geopotential  height,  geopotential 
thickness,  and  u-  and  v-wind  components,  they  must  be  transformed 
into  observed  increments  of  the  same  variables.  The  procedures  used  to 
form  the  increments  are  more  or  less  the  same  for  all  data  types.  The 
6-hour  forecast  fields  from  the  NOGAPS  spectral  model  are  interpolated 
to  the  analysis  grid.  Franke  (1985)  showed  that  significant  error  can  be 
introduced  at  this  stage  in  the  analysis  if  simple  bilinear  interpolation 
is  used.  Thus,  the  appropriate  background  field  is  horizontally  interpolated 
to  each  observation  location  using  a  bicubic  spline  interpolator.  If  the 
observation  is  reported  at  a  defined  analysis  pressure  level,  no  vertical 
interpolation  is  necessary.  The  observation  minus  background  difference 
is  computed,  and  the  observed  value  is  replaced  with  the  incremental 
value. 

There  are  a  few  exceptions  to  the  general  procedure:  (a)  Aircraft  and 
cloud-tracked  wind  observations  are  not  necessarily  at  the  analysis  pressure 
levels.  Thus,  the  horizontal  interpolation  of  the  background  field  must 
be  done  at  the  levels  above  and  below  the  off-level  observation.  The 
two  values  at  the  observation’s  horizontal  location  are  then  vertically 
interpolated  to  the  level  of  the  observation.  The  vertical  interpolation  is 
linear  with  respect  to  pressure,  (b)  Since  thickness  is  not  a  model  output 
field,  the  height  background  fields  are  differenced  to  provide  the  thickness 
background  fields  for  each  layer.  These  fields  can  then  be  horizontally 
interpolated  to  the  satellite  sounding  locations  to  compute  the  thickness 
increments,  (c)  The  marine  surface  wind  components  from  ships,  buoys, 
and  the  SSM/I  are  differenced  with  the  10-m  wind  field  from  the  model, 
valid  at  the  analysis  time,  rather  than  the  1000-mb  wind  field.  The 
10-m  wind  field  is  a  NOGAPS  by-product  that  is  produced  primarily  to 
provide  forcing  data  for  the  FNOC  oceanographic  models.  The  10-m 
winds  agree  more  closely  with  the  surface  wind  reports  than  do  the 
1000-mb  winds.  Then,  assuming  that  the  wind  increments  at  10  m  are 
not  much  different  than  the  increments  at  1000  mb,  we  simply  assign 
these  surface  wind  increments  to  the  first  analysis  level. 


In  practice,  very  few,  if  any,  observations  are  excluded  during  the 
data  processing  described.  The  preliminary  checks  performed  on  the  data 
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before  they  are  passed  to  the  analysis  effectively  eliminate  or  flag  the 
observations  with  values  that  are  obviously  out  of  range.  Similar  checks 
in  the  analysis  are  therefore  redundant  and  are  performed  more  as  a 
fail-safe  measure  than  anything  else.  However,  in  an  operational  system, 
a  certain  amount  of  redundancy  is  desirable,  since  even  a  small  data 
glitch  can  cause  an  entire  system  to  crash.  From  experience,  we  have 
learned  to  always  expect  the  unexpected,  and  we  design  our  systems 
accordingly. 

The  remaining  analysis  quality  control  algorithms  are  designed  to 
identify  observations  that  are  erroneous  even  though  their  values  may 
appear  to  be  perfectly  reasonable.  These  quality  tests  are  therefore 
performed  after  the  observed  increments  have  been  computed  and  always 
use  the  absolute  difference  of  the  observation  from  the  background. 
The  first  quality  check  made  on  the  increments  is  the  check  for  gross 
errors,  and  only  increments  that  have  already  been  flagged  for  rejection 
or  increments  from  the  synthetic  tropical  soundings  escape  this  test. 
Keeping  in  mind  that  the  short-range  model  forecasts  have  become  very 
accurate,  we  essentially  use  the  background  field  as  a  control  on  the 
reasonableness  of  the  data. 

The  amount  of  deviation  from  the  background  that  will  be  tolerated 
varies  from  one  data  type  to  another,  and  is  a  function  of  both 

~o  ~  p 

the  observation  error  E  and  the  prediction  error  E  associated  with  the 
observation  and  its  location.  The  appropriate  values  for  E°  and  Ep  are 
estimated  using  the  techniques  described  in  Section  2.2.  The  observation 
errors  currently  assigned  to  each  data  type  are  shown  in  Table  2.  The 
thickness  errors  given  are  for  the  NOAA  retrievals.  For  DMSP  retrievals, 
these  values  are  multiplied  by  1.5  in  the  northern  hemisphere  above 
70  mb  and  by  2.0  in  the  southern  hemisphere  above  100  mb.  Otherwise, 
the  observation  errors  do  not  vary  from  one  horizontal  location  to  another. 

To  perform  the  gross-error  check  for  each  observed  increment,  we 
define  the  quantity  7}/ 

T«  =\Hf +  {Ef\m .  (50) 

where  EP/  is  the  prediction  error  and  E#  is  the  observation  error  for 
the  increment  at  location  i  and  level  /.  Both  errors  are  functions  of  the 
level,  and  the  observation  error  is  a  function  of  the  data  type.  While 
the  prediction  errors  used  in  the  analysis  are  also  a  function  of  horizontal 
location,  these  variations  have  not  yet  been  defined.  Thus,  the  value  of 
Ept  used  for  the  gross-error  check  is  allowed  to  vary  vertically  but  not 
horizontally.  The  values  used  are  those  shown  in  Table  3,  where  the 
height  and  thickness  errors  are  derived  from  the  wind  errors  using 
equation  (48)  with  a  value  of  the  Coriolis  parameter  valid  at  60°. 

Larger  values  of  either  observation  error  or  rediction  error  result  in 
larger  values  of  T.  Thus,  equation  (50)  implies  that  data  types  known 
to  have  large  errors  are  given  a  little  more  leeway  before  they  are 
rejected.  While  this  may  seem  contradictory,  it  reflects  the  desire  to 
differentiate  between  anomalies  due  to  instrument  inaccuracy  and 
anomalies  that  reflect  truly  erroneous  data.  Similarly,  we  also  tolerate 
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Pres  (mb) 


Gpt  Hgt  4  (ypm) 


Raob 


Table  2.  NOGAPS  observation  errors. 


U  or  V  Wind  (m/sec) 


NOAA  Thickness  (gpm) 


Gpt  Helgnt  4  (gpm) 


Sounding  Aircraft  Cld-track  I  Pres  (mb)  Clear  Pt  Cld  Cloudy 


1000-925  101  151 
925-650  85  123 
850-700  167  250 
700-500  241  360 


500-400  128  192 
400-300  165  248 
300-250  105  158 
^50-200  128  192 


200-150  165  248 

150-100  233  350 

100-70  205  308 

70  -iO  193  290 


293  440 
233  350 
398  597 


U  or  V  Wind  (m/sec) 


Surface 


Table  3.  NOGAPS  prediction  errors. 
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larger  absolute  increments  in  areas  with  higher  prediction  errors,  reflecting 
the  fact  that  the  background  is  more  likely  in  error. 

To  actually  perform  the  error-checking,  we  specify  tolerances  that 
are  multiples  of  7',  which  define  a  range  of  values  where  the  observed 
increment  is  considered  likely  to  be  erroneous,  and  a  limiting  value, 
beyond  which  the  increment  will  be  rejected.  If  an  increment  is  accepted 
as  accurate  by  the  gross-error  check,  it  will  receive  no  further  checking 
by  the  analysis,  unless  it  was  flagged  as  suspect  during  the  preliminary 
checks.  Under  no  circumstances  will  the  analysis  quality  flag  of  a 
questionable  observation  be  reset  to  indicate  that  the  observation  is 
good.  Quality  flags  can  only  be  changed  in  the  negative  sense. 

The  defined  tolerances  vary  with  data  type  and  location  and  are 
summarized  in  Tables  4-6.  The  most  straightforward  tests  are  performed 
on  wind  reports  (Table  4).  Radiosonde,  pibal,  SSM/I,  marine  surface, 
and  aircraft  wind  increment  pairs  are  rejected  if  either  the  u-  or  rewind 
increment  exceeds  3.5  T,  and  the  wind  pair  is  flagged  for  further  checking 
if  either  increment  exceeds  2.5  T.  The  off-level  wind  reports — aircraft 
and  cloud-tracked  winds — are  compared  to  the  value  of  T  that  is  valid 
at  the  analysis  level  at  or  immediately  below  the  observed  pressure. 
The  satellite  cloud-tracked  wind  data  are  subjected  to  the  most  stringent 
criteria,  for  they  are  rejected  if  either  wind  increment  exceeds  1.5  T. 
Cloud-tracked  winds  are  either  accepted  or  rejected  at  this  point  in  the 
analysis;  they  are  not  flagged  for  further  checking. 

As  an  illustration,  consider  a  set  of  wind  observations  from  various 
sources,  all  at  one  pressure  level  and  with  the  prediction  error  the  same 
at  each  observation  location.  Since  the  contribution  of  the  prediction 
error  to  T  will  be  constant  for  all  the  observations,  variations  in  T  will 
be  solely  a  function  of  the  observation  error.  Using  the  appropriate 
error  values  for  250-mb  wind  data  from  Tables  2  and  3,  computed 
values  of  T  vary  from  4.9  m/sec  for  radiosonde  and  pibal  winds  to  a 
high  of  7.5  m/sec  for  cloud-tracked  winds.  However,  since  the  tolerances 
are  a  function  not  only  of  T,  but  also  vary  with  data  type,  we  find  that 
radiosonde  and  pibal  winds  will  be  rejected  if  either  the  n-  or  rewind 
observation  differs  from  the  background  by  more  than  17.1  m/sec.  The 
comparable  limit  is  19.3  m/sec  for  aircraft  winds,  and  only  11.3  m/sec 
for  cloud-tracked  winds.  With  the  exception,  perhaps,  of  the  cloud-tracked 


Table  4.  Gross-error  check  wind  tolerances. 


T  Varies  with  Level  and  Data  Type 

Data  Increment  Type 

No  Checking 

Further  Checking 

Rejection 

Aircraft  Wind 

<2.5  T 

2. 5-3.5  r 

>3.5  T 

SSM/I  Wind 

<2.5  T 

2, 5-3.5  r 

>3.5  T 

Ship/Buoy  Wind 

<2.5  T 

2.5-3.5  T 

>3.5  T 

Radiosonde  Wind 

<2.5  T 

2. 5-3.5  T 

>3.5  r 

Pibal  Wind 

<2.5  T 

2.5-3. 5  T 

>3.5  T 

Cloud-Track  Wind 

SI  .5  T 

None 

>1.5  T 

42 


MVOl  Analysis  of  Meteorological  Data  at  FNOC 


winds,  rather  large  deviations  from  the  background  are  still  accepted 
by  the  gross-error  check,  by  design.  Absolute  increments  smaller  than 

12.2  m/sec  for  radiosondes  and  pibals,  13.8  m/sec  for  aircraft,  and 

11.3  m/sec  for  cloud-tracked  winds  will  be  accepted  as  is,  unless  they 
were  flagged  as  suspect  during  the  preliminary  checks.  Values  in  between 
will  be  subjected  to  comparisons  with  surrounding  observations  later  in 
the  analysis. 

The  tolerances  specified  for  conventional  radiosonde  and  surface  height 
observations  are  more  variable,  since  they  are  adjusted  according  to  the 
latitude  of  the  observation  (Table  5).  Surface  height  increments  are 
flagged  for  additional  checking  if  their  absolute  value  exceeds  3.0  7”, 
where  T  =  0.8  T  for  observations  between  -30°  and  30°N  and  T'=T 
elsewhere.  Similarly,  increments  exceeding  5.0  T'  are  rejected,  but  only 
if  they  are  over  land  or  between  -25°  and  25°N,  Marine  height  observations 
in  the  midlatitudes  and  polar  regions  cannot  be  rejected  by  the  gross - 
error  checks.  However,  they  can  be  flagged  for  additional  checking. 
The  PAOBS  are  processed  along  with  the  surface  data,  but  the  tolerances 


Table  5.  Gross-error  check  height  tolerances. 


T  Varies  with  l.evel  and  Data  Type 

Data  Increment  Type 

No  Checking 

Further  Checking 

Rejection 

Land  Surface  Height 
Extra-Tropics 

<3.0  T 

3.0-5.0  T 

Tropics 

<2.4  T 

2.4-4.0  T 

ns 

Marine  Surface  Height 
Extra-Tropics 

<3.0  T 

' 

None 

Tropics 

<2.4  T 

>4.0  T 

Australian  PAOB  Height 
Extra-Tropics 

m 

56.0  r 

None 

Tropics 

4.8-8.0  T 

>8.0  T 

Radiosonde  Height 

Polar  Areas 

Below  30  mb 

<4.8  T 

4.8-8.0  T 

>8.0  T 

30  to  20  mb 

<7.7  T 

7.7-12.8  T 

>12.8  T 

10  mb 

Midlatitudes 

<11.5  T 

11.5-19.2  T 

>19.2  T 

Below  30  mb 

<3.0  T 

>5.0  T 

30  to  20  mb 

<4.8  T 

4.8-8.0  T 

>8.0  T 

10  mb 

<7.2  T 

7.2-12.0  r 

>12.0  r 

Tropics 

Below  30  mb 

<2.4  T 

2. 4-4.0  r 

>4.0  r 

30  to  20  mb 

<3.8  T 

3.8-6.4  T 

>6.4  r 

10  mb 

<5.8  T 

5. 8-9.6  T 

>9.6  r 

Synthetic  Radiosonde 
Midlatitudes 

Below  30  mb 

m 

3.0-5. 0  T 

■ 

30  to  20  mb 

1 

4. 8-8.0  T 

10  mb 

7.2-12.0  T 

>12.0  r 

Tropics 

All 

None 

None 
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are  increased  by  a  multiple  of  2  for  these  synthetic  observations,  thus 
allowing  larger  increments  to  pass  the  gross-error  check. 

The  rejection  and  checking  tolerances  for  radiosonde  height  data  are 
equal  to  5.0  T"  and  3.0  T",  respectively.  Several  factors  are  considered 
in  defining  T".  First  we  define  F  as  a  function  of  latitude,  with 
F  =  0.8  T  between  latitudes  -30°  and  30°,  F  =  T,  and  T'=\.6T poleward 
of  latitudes  -60°  and  60°.  Thus,  observations  in  the  polar  regions  are 
given  more  leeway  than  those  in  the  mid-latitudes  and  tropics,  thereby 
accounting  for  the  increased  magnitude  of  the  background  errors  in  the 
polar  areas.  Then,  T"  is  used  to  allow  further  variations  of  the  tolerances 
with  height,  over  and  above  the  variations  induced  by  the  observation 
and  prediction  errors.  This  increase  in  tolerance  is  applied  only  at 
30  mb  and  above,  with  F'=1.6F  at  30-20  mb  and  T"  =  2.4  F 
at  10  mb,  the  uppermost  analysis  level.  This  final  increase  was 
introduced  to  control  the  occasional  event  where  the  spectral  forecast 
model  displays  especially  large  errois  at  the  upper  levels.  In  these  cases 
we  do  not  want  radiosonde  data  to  be  rejected.  We  have  found  that 
being  overly  tolerant  of  the  data  at  these  levels  does  no  harm. 

Even  though  our  theoretical  development  of  the  MVOI  assumes  that 
observation  errors  are  uncorrelated,  in  reality,  observation  errors  for 
satellite  thickness  increments  tend  to  be  highly  correlated  horizontally, 
especially  for  observations  from  the  same  satellite.  Because  of  this 
correlation,  erroneous  observations  would  lend  support  to  each  other 
when  subjected  to  the  corroboration  check  by  the  analysis.  Therefore, 
more  stringent  rejection  tolerances  are  used  for  the  gross-error  check  of 
thickness  data,  with  no  provision  for  further  checking  by  the  analysis 
(Table  6).  The  rejection  tolerance  is  set  at  1.5  T".  The  variation  with 
latitude,  defined  by  F,  is  exactly  the  same  as  for  the  radiosondes. 
Additional  variations  with  height  are  defined  for  the  layers  above  50  mb, 
with  F'=1.6F  for  the  50-30-mb  and  30-20-mb  layers  and 
T"  =  2.4  F  for  the  20-10-mb  layer.  After  these  initial  checks  are  made, 


Table  6.  Gross-error  check  thickness  tolerances. 


T  Varies  with  Level  and  Data  Type 

Data  Increment  Type 

No  Checking 

Further  Checking 

Rejection 

Satellite  Thickness 

Polar  Areas 

Below  50  mb 

£2.4  T 

None 

>2.4  T 

50  to  20  mb 

£3.8  r 

None 

>3.8  T 

20  to  10  mb 

£5.8  r 

None 

>5.8  T 

Extra-Tropics 

Below  50  mb 

£1.5  T 

None 

>1.5  T 

50  to  20  mb 

£2.4  r 

None 

>2.4  r 

20  to  10  mb 

£3.6  T 

None 

>3.6  r 

Tropics 

Below  50  mb 

£1.2  T 

None 

>1.2  T 

50  to  20  mb 

£1.9  T 

None 

>1.9  T 

20  to  10  mb 

£2.9  T 

None 

>2.9  T 
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the  satellite  sounding  data  that  are  not  rejected  are  then  subjected  to 
lapse-rate  checks. 

Lapse-rate  checks  are  required  because  of  the  inherently  poor  vertical 
resolution  of  the  derived  measurements,  which  results  in  satellite  retrieval 
thickness  errors  in  the  lower  troposphere  that  are  often  negatively 
correlated  with  errors  in  the  upper  troposphere.  In  situations  where  the 
actual  atmosphere  is  unstable  (temperature  decreasing  more  rapidly  with 
height  than  normal),  the  satellite  soundings  tend  to  be  too  stable. 
The  opposite  is  true  in  stable  atmospheric  conditions.  In  either  case,  the 
vertical  temperature  structure  of  the  atmosphere  is  almost  always 
depicted  more  accurately  by  the  analysis  background.  We  have  found 
that  in  the  more  extreme  situations  the  absolute  difference  between  the 
lower  tropospheric  thickness  increments  and  those  from  the  upper 
troposphere  is  large.  The  lapse-rate  checks  are  designed  to  identify 
those  cases  where  the  use  of  the  satellite  soundings  would  impact 
negatively  upon  the  analysis  and  the  ensuing  model  forecast. 

The  lapse-rate  checks  are  only  performed  for  soundings  where  the 
1000-925-mb  thickness  is  available,  thereby  disregarding  many  of 
the  soundings  over  land.  The  total  thickness  increments  for  1000-700  mb 
and  500-300  mb  are  computed  by  summing  the  reported  intermediate 
layers.  Allowing  for  the  fact  that  the  1000-700-mb  layer  represents 
approximately  1.43  times  the  mass  of  the  500-300-mb  layer,  the  lower 
layer  is  multiplied  by  1.43  before  the  upper  layer  value  is  subtracted 
from  it.  The  allowed  tolerance  for  this  difference  is  estimated  by  summing 
the  thickness  prediction  errors  for  the  three  layers  from  1000  to  700  mb, 
multiplying  by  0.58  and  adding  this  result  to  the  thickness  prediction 
errors  for  the  two  layers  from  500  to  300  mb.  Using  the  values  shown 
in  Table  3,  this  results  in  a  tolerance  of  422  gpm.  This  intermediate 
value  is  multiplied  by  1.333  to  get  the  absolute  difference  of  563  gpm. 
Any  sounding  whose  difference  exceeds  this  value  has  the  lowest  six 
layers  of  the  sounding  (essentially  the  tropospheric  part)  rejected.  For 
soundings  with  differences  between  422  and  563  gpm,  the  tropospheric 
part  of  every  other  sounding  is  rejected.  The  lapse-rate  check  does  not 
affect  any  portion  of  the  sounding  above  300  mb.  It  is  possible  to  adjust 
the  multiplicative  factors  of  0.58  and  1.333  through  the  executable  job 
stream,  thus  requiring  no  software  changes. 

Table  7  illustrates  the  typical  number  of  observations  flagged  as 
either  suspicious  or  bad  by  the  preliminary  checks  and  the  analysis 
gross-error  check.  The  largest  rejection  rates  are  associated  with  satellite 
wind  and  sounding  data,  consistent  with  the  fact  that  these  observations 
receive  no  further  checking  by  the  analysis.  The  following  numbers 
are  ail  approximate,  but  typically  15-20%  of  the  cloud-tracked  winds  are 
rejected  by  the  gross-error  check.  About  7%  of  the  thickness  observations 
are  rejected  between  both  the  gross-error  and  lapse-rate  checks.  For  the 
other  data  types,  the  number  of  individual  observations  rejected  by 
the  gross-error  check  is  very  small.  Only  1%  of  the  surface  reports  and 
less  than  2%  of  the  radiosonde  observations  are  considered  erroneous; 
these  numbers  hold  true  for  both  wind  and  height  data.  Wind  reports 
from  pibals  and  from  aircraft  are  rejected  at  the  rate  of  4-5%.  The 
number  of  questionable  observations  is  only  slightly  larger.  About  3% 
of  the  conventional  surface  and  radiosonde  wind  and  height  reports  are 
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Table  7.  Quality  control  results. 


4.0  Analysis  Process 


4.1  Analysis  Volumes 


Following  Preliminary  Checka  and  Anatyala  Groat-Error  Cl>acka 

Data  Increment  Type 

Percent  Rejected 

Percent  Suspected 

Aircraft  Wind 

4-5% 

4-5% 

Pibal  Wind 

4-5% 

a% 

Aadloaonde  Wind 

2% 

3% 

Cloud-Tracked  Wind 

15-20% 

0% 

Marine  Surface  Wind 

1% 

3% 

Radloaonde  Height 

2% 

3% 

Surface  Height 

1% 

3% 

Satellite  Thickneae 

7% 

0% 

flagged  for  further  checking.  Aircraft  wind  observations  undergo  further 
scrutiny  about  4-5%  of  the  time,  and  pibai  winds  are  considered  suspect 
in  about  8%  of  the  cases. 

In  practice,  no  observation  is  ever  removed  from  the  data  set.  Instead, 
the  analysis  quality  flag  is  reset,  as  necessary,  to  indicate  those 
increments  that  have  been  rejected  and  the  values  that  are  still  suspicious. 
The  analysis  will  then  ignore  the  data  with  rejection  flags,  and  the 
questionable  values  will  be  subjected  to  yet  another  quality  check  within 
the  analysis  to  determine  if  they  are  consistent  with  other  nearby 
increments  (Section  4.4).  Increments  that  pass  both  the  preliminary  quality 
checks  and  the  gross-error  checks  will  undergo  no  further  quality  control 
and  will  influence  the  outcome  of  the  analysis  at  surrounding  grid  points. 
The  analysis  quality  flags  are  output  along  with  the  original  observations 
and  saved  in  the  global  database,  which  allows  us  to  monitor  the 
performance  of  individual  platforms,  countries,  and  data  types  for  routinely 
large  biases  or  errors. 


While  Section  2.0  emphasized  the  specification  of  the  statistical 
parameters  for  the  MVOI,  it  is  obvious  from  equation  (1)  that  the  analyzed 
increment  at  a  point  k  is  also  determined  by  the  particular  set  of 
observations  that  are  used  to  form  the  summation  on  the  right-side 
of  that  equation.  To  optimize  the  selection  of  the  observations, 
the  Navy’s  MVOI  analysis  scheme  has  been  patterned  after  the 
volume  method  introduced  by  Lorenc  (1981).  The  major  advantage  of 
the  volume  method  is  that  it  permits  the  production  of  optimum 
interpolation  analyses  at  a  large  number  of  grid  points  from  one  set  of 
observations,  so  that  the  necessary  matrix  and  vector  calculations  need 
only  be  performed  once.  This  increased  efficiency  allows  the  use  of 
many  more  observations  for  each  analysis  point  than  is  computationally 
feasible  with  the  grid-point  method  (DiMego,  1988).  Furthermore,  it 
simplifies  the  data  selection  process  and  allows  for  rigorous  quality 
control  of  the  observation  increments. 

One  of  the  most  important  considerations  in  the  design  of  the  volume 
method  for  the  MV  ."'I  is  the  selection  of  the  analysis  volumes  themselves. 
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By  our  definition,  a  volume  represents  not  only  a  set  of  grid  points,  but 
also  the  physical  space  containing  the  set  of  observations  that  will 
influence  the  analysis  at  these  grid  points.  Due  to  software  constraints, 
the  number  of  observations  within  any  given  volume  cunnot  exceed 
some  specified  maximum,  Thus,  to  ensure  u  sufficient  set  of  observations 
within  each  volume,  it  is  dcsirublc  to  have  the  volume  size  vary  with 
data  density.  Fortunately,  the  distribution  of  data  around  the  globe  is 
fairly  constant  in  time,  which  allows  us  to  specify  the  volume  sizes  in 
some  predetermined  fashion. 

First,  the  globe  is  divided  into  a  set  of  838  overlapping  volumes  that 
vary  in  horizontal  extent  (Fig.  20),  but  are  fixed  in  vertical  extent.  Each 
volume  contains  all  16  analysis  pressure  levels,  thus  entirely  eliminating 
the  need  for  vertical  averaging.  The  polar  caps  comprise  two  of  the 
volumes  and,  due  to  the  convergence  of  the  meridians,  account  for  a 
disproportionate  number  of  grid  points  with  respect  to  the  actual  physical 
areas  they  encompass.  The  polar  cap  volumes  are  chosen  to  be  as  large 
as  possible,  given  the  typical  distribution  of  observations  in  each  region. 
The  south  polar  cap  extends  to  79.9°S  while  the  north  polar  cap  extends 
to  81.4°N.  Thus,  over  10%  of  the  global  analysis  grid  points  are  contained 
in  these  two  volumes. 

Over  the  remainder  of  the  globe,  the  analysis  volumes  are  arranged 
in  overlapping  latitudinal  strips.  The  north-south  extent  of  these  strips 
is  18°  in  the  Southern  Hemisphere  and  9°  in  the  Northern  Hemisphere, 
and  each  strip  overlaps  one-half  of  the  adjacent  strips  to  the  north  and 
south  of  it.  Within  each  strip,  the  longitudinal  extent  of  each  volume  is 
varied  so  that  the  east-west  distance  covered  by  a  volume  is  less  in 
areas  rich  in  conventional  data  than  in  areas  sparsely  covered  by 
conventional  data  (generally  over  the  oceans).  The  volumes  within  each 
strip  also  overlap  one  another,  with  each  volume  extending  from  the 
center  of  the  volume  west  of  it  to  the  center  of  the  volume  east  of  it. 
Thus,  with  the  exception  of  the  polar  caps,  each  grid  point  is  included 
within  four  separate  volumes,  resulting  in  four  separate  estimates  for 
the  analyzed  increment  at  each  point.  The  final  result  at  each  point  is 
obtained  by  computing  the  appropriate  weighted  sums  of  the  various 
analyzed  increments  at  the  point.  The  overlapping  volumes  and 
the  resulting  averaging  process  for  the  analyzed  increments  prevent  the 
introduction  of  any  false  gradients  or  discontinuities  at  the  volume 
boundaries. 

4.2  Data  Selection  Since  all  ({>,  u,  v,  and  A<(>  observations  within  a  volume  are  used  to 

estimate  the  analyzed  <j>,  u,  and  v  increments  at  each  grid  point  in  the 
volume,  it  is  also  very  important  to  ensure  the  proper  distribution 
of  observations  within  each  analysis  volume.  A  representative  number  of 
observations  at  each  analysis  level  is  desirable,  as  well  as  a  reasonable 
horizontal  distribution,  with  an  appropriate  mixture  of  mass  and  wind 
observations.  First  priority  is  given  to  conventional  radiosonde,  pibal, 
and  surface  observations,  followed  by  SSM/I  winds,  satellite  temperature 
soundings,  aircraft,  and  cloud-tracked  winds,  in  that  order.  Satellite 
thicknesses  are  used  only  above  500  mb  for  volumes  rich  in  conventional 
data,  while  in  areas  especially  rich  in  conventional  data,  thicknesses  are 
included  only  above  200  mb. 
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The  total  number  of  observations  allowed  for  each  volume  (counting 
(}>,  «,  v ,  and  A4>  separately)  cannot  exceed  360,  and  typically  averages 
around  300.  However,  the  areal  extent  of  the  volume  is  symmetrically 
expanded  if  the  total  number  of  observations  found  is  less  than  60,  but 
only  for  the  purpose  of  including  more  observations.  The  analysis  itself 
is  still  performed  only  for  the  grid  points  contained  within  the  original 
volume.  A  maximum  of  two  expansions  is  permitted,  with  each  expansion 
increasing  the  east-west  and  north-south  extent  of  the  volume  by  50% 
of  the  original  distances.  For  example,  an  original  volume  that  was 
1000  km  on  a  side  could  be  expanded  to  as  large  as  2000  km  on  a  side. 
Typically,  volume  expansion  is  only  required  when  normal  amounts  of 
data  are  not  received  prior  to  the  analysis. 

All  data  types  are  sorted  by  latitude  and  scanned  in  order  of  their 
priority  to  find  the  observations  in  each  analysis  volume.  To  ensure  the 
proper  mix  of  data  types,  limits  are  imposed  on  the  number  of  observations 
of  each  type  that  can  be  included  in  a  given  volume  (Table  8).  The  data 
collection  process  begins  with  the  first  observation  in  a  set,  and  proceeds 
to  gather  every  nth  observation,  where  n  ranges  from  1  to  5  based  on 
the  observational  density  of  the  data  type.  The  same  procedure  is  followed 
again,  this  time  beginning  with  the  second  observation  and  selecting 
every  nth  observation.  These  iterations  continue  until  all  the  observations 
of  one  data  type  in  the  given  volume  have  been  collected,  or  until  th^ 
limit  for  that  data  type  is  reached.  We  then  proceed  to  the  next  data 
type,  and  repeat  the  process.  Because  of  the  prior  sorting,  this  iterative 
collection  procedure  ensures  that  each  data  type  will  be  uniformly 
distributed  throughout  the  volume. 

As  Table  8  illustrates,  we  still  rely  heavily  on  the  radiosonde  data 
when  it  is  available.  Since  there  are  so  few  pibal  soundings  in  any 
given  volume,  it  is  not  necessary  to  place  any  restrictions  on  their 
number;  all  of  them  are  included.  The  number  of  thickness  observations 
from  satellites  will  be  reduced  from  252  to  126  in  volumes  where  other 
sounding  data  are  plentiful.  This  reduction  is  consistent  with  the  fact 
that  we  ignore  the  lower-level  portion  of  the  satellite  soundings  in  areas 
rich  in  conventional  data,  typically  over  land.  Over  the  oceans,  the 
252  observation  limit  on  the  thicknesses  ensures  that  space  is  available 
for  aircraft  and  cloud-tracked  wind  observations. 

4.3  Determination  of  Once  the  A'-observed  increments  for  a  volume  have  been  collected, 

the  Weights  the  weights  must  be  computed  for  each  increment  relative  to  each  grid 

point  in  the  volume.  Recall  from  the  earlier  discussion  that  the  weight 
wik  is  dependent  upon  three  statistical  quantities — the  observation 
error,  the  prediction  error,  and  the  prediction  error  correlations,  where 
the  correlations  are  computed  between  the  observation  location  i  and  the 
locations  of  the  other  N  -  1  observations  in  the  volume,  and  between 


Table  8.  Observation  limits  per  volume. 


Counts  Include  Each  Individual  u,  v,  and 

Total 

Radiosonde 

Pibal 

Surface 

SSM/I 

Sat.  Thickness 

Aircraft 

Sat.  Wind 

360 

270 

— 

28 

24 

252 

36 

24 
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the  N  observation  locations  and  the  grid  location  k.  Except  for  the 
correlations  between  observation  and  grid  locations,  these  quantities 
can  be  computed  once  and  used  for  all  grid  locations  in  the  volume. 
This  feature  of  the  volume  method  makes  the  use  of  a  large  number  of 
observations  for  the  analysis  at  each  grid  location  computationally  feasible. 
Most  of  the  computations  involved  in  determining  the  values  of  the 
weights  are  involved  in  inverting  the  N  x  N  matrix  M  for  evaluating 
equation  (21).  In  the  NOGAPS  MVOI,  there  are  over  460,000  grid 
locations  contained  in  just  over  800  volumes.  Thus,  the  grid-point  method 
would  require  over  500  times  the  number  of  matrix  inversions  that 
are  required  for  the  volume  method.  Since  the  number  of  computations 
required  for  a  matrix  inversion  is  on  the  order  of  N 2,  the  volume  method 
permits  the  use  of  more  than  20  times  as  many  observations  for  each 
grid  location  without  increasing  the  computation  time. 

In  our  previous  discussions  of  prediction  errors  (Sections  2.2  and 
3.3),  the  relationship  between  the  height  and  wind  component  errors 
described  by  equation  (48)  assumed  a  constant  value  for  the  Coriolis 
parameter  7-  For  example,  the  values  in  Table  3  are  computed  using 
the  value  of  7  at  60°N  and  are  used  for  the  gross-error  check,  while  the 
illustration  in  Figure  3  is  based  on  the  value  of  7  at  45°.  However, 
within  the  analysis,  the  prediction  errors  are  varied  with  horizontal 
location.  For  each  volume,  the  average  value  of  the  Coriolis  parameter 
within  the  volume  is  determined,  with  the  caveat  that  the  absolute  value 
of  the  average  Coriolis  parameter  is  never  permitted  to  be  less  than  its 
absolute  value  at  30°.  For  volumes  rich  in  conventional  data,  the  wind 
component  prediction  errors  are  those  displayed  in  Table  3.  For 
data-sparse  volumes,  the  prediction  errors  for  the  wind  components  are 
increased  by  a  factor  of  1.3.  The  height  prediction  errors  for  a  volume 
are  then  computed  from  the  wind  prediction  errors  using  equation  (48). 

Even  though  all  observed  increments  in  the  volume  contribute  to  the 
analysis  at  grid-point  k,  the  weights  computed  for  some  increments  will 
be  larger  than  others,  with  some  increments  having  little  influence  on 
the  actual  outcome  at  point  k.  However,  the  same  increments  may  have 
a  much  larger  impact  on  the  analyzed  increment  at  some  other  point  in 
the  volume.  To  further  illuminate  how  the  optimum  interpolation  technique 
works,  we  will  discuss  how  the  different  statistical  quantities  affect  the 
actual  weight  received  by  the  various  observations  relative  to  the  analysis 
at  a  particular  point. 

First,  consider  the  effect  of  the  normalized  observation  error  (the 
ratio  of  observation  error  to  prediction  error).  If  two  observed  increments 
are  co-located,  the  prediction  error  is  the  same,  so  the  increment  with 
the  largest  observation  error  would  receive  the  smaller  weight.  Thus,  if 
provided  data  from  many  sources,  the  MVOI  would  give  more  weight 
to  observation  types  known  to  be  the  most  reliable.  Furthermore,  if  the 
prediction  error  is  large  compared  to  the  observation  error,  the  analysis 
will  draw  more  closely  to  the  observations.  However,  if  the  observation 
error  is  large  compared  to  the  prediction  error,  the  analyzed  increments 
will  be  smaller  and  the  resulting  analysis  will  more  closely  resemble 
the  background  field. 

Second,  consider  the  effect  of  the  prediction  error  correlation  between 
the  observation  location  and  the  grid-point  location.  Since  the 
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Control 


height-height  prediction  error  correlation  model  is  a  function  of  distance, 
in  general,  height  and  thickness  observations  that  are  farther  from  the 
grid  point  (in  three-dimensional  space)  receive  less  weight  for  the  analysis 
of  height  at  the  grid  location.  However,  for  other  data  combinations,  the 
correlation  functions  are  not  circular  (Figs.  6b-f).  For  example, 
the  functions  correlating  height  and  wind  data  are  lobed,  while  the  (mu) 
and  ( w)  functions  are  somewhat  elliptical.  Thus,  for  an  analysis  of  u  at 
a  grid  point,  the  u-wind  observations  east  and  west  of  the  grid  point 
receive  more  weight  than  those  sampled  in  the  cross-component  direction. 
For  an  analysis  of  4>  at  a  grid  point,  the  y-wind  increments  to  the 
east-west  and  the  M-wind  increments  to  the  north-south  have  the  most 
influence.  Given  the  known  relationship  between  the  height  gradient 
and  the  u-  and  rewind  components,  such  structures  make  sense  physically. 
In  the  tropics,  the  geostrophic  constraint  is  relaxed,  as  evidenced  by  the 
decreased  correlation  between  the  height  and  wind  fields  shown  in 
Figures  7b  and  c.  Increasing  the  divergence  allowed  in  the  wind  field 
decreases  the  coupling  between  the  n  and  v  components  (Fig.  7d), 
while  only  slightly  changing  the  shape  of  the  (mm)  and  ( w)  functions 
(Figs.  7e-f). 

The  final  parameter  influencing  the  weight  w,*  is  the  set  of  prediction 
error  correlations  of  the  observation  locations  with  the  other  data  locations 
in  the  volume.  Whereas  the  previous  discussion  accounted  for  the 
distribution  of  the  observations  relative  to  the  grid-point  location, 
the  prediction  error  correlations  between  observation  locations  account 
for  the  distribution  of  the  observations  relative  to  one  another.  The 
effect  of  these  terms  is  to  essentially  downweight  clusters  of  observations, 
particularly  those  of  the  same  variable  type.  For  example,  consider  two 
height  observations  equidistant  from  a  grid  point  but  on  opposite  sides. 
All  other  properties  being  equal,  the  two  observations  would  receive 
equal  weights.  However,  if  the  two  observations  were  located  on  the 
same  side  of  the  grid  point,  still  at  the  same  distance,  they  would  receive 
less  weight  than  in  the  first  case,  since  now  the  prediction  error  correlation 
of  the  two  observation  locations  is  higher  (Clancy  et  al.,  1989). 

While  considering  the  contributions  of  the  various  terms  in  isolation 
is  educational,  in  reality,  the  weight  a  particular  increment  receives 
results  from  a  complex  interaction  of  all  these  factors.  However,  it  is 
apparent  that,  unlike  earlier  objective  analysis  schemes  that  essentially 
averaged  the  nearby  data,  an  optimum  interpolation  analysis  scheme 
makes  some  very  intelligent  decisions  about  how  to  treat  the  various, 
often  conflicting,  sources  of  information  it  has  available. 


As  the  analysis  estimates  are  made  at  each  grid  location,  the  analysis 
performs  further  quality-control  checks  on  increments  that  were  flagged 
earlier  as  suspect  by  either  the  data  preprocessor  or  by  the  analysis 
gross-error  checks.  This  phase  of  the  quality  control  attempts  to  verify 
that  the  questionable  values  are  supported  by  the  other  observed 
increments.  To  accomplish  this  task,  each  flagged  increment  is  compared 
to  an  analyzed  increment  computed  at  the  observation  location  with  the 
flagged  increment  in  question  excluded.  If  these  two  values  differ  by 
more  than  the  estimated  interpolation  error  for  that  location,  the  increment 
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is  flagged  for  rejection,  and  it  is  permanently  removed  from  any  further 
consideration  by  the  analysis. 

To  perform  this  final  stage  of  quality  control,  we  first  define  the 
expected  interpolation  error  at  the  location  of  a  flagged  observation  to  be 
the  ensemble  average  ((/“  --/f)2) .  The  estimate  oMhe  interpolation 
error  is  a  function  of  the  observation  error  (e?)  ,  and  after  some 
manipulation  (Lorenc,  1981),  it  can  be  shown  that 

((/"-/fO-leff  +  l-J,  (51) 

where  wj  are  the  weights  wj  computed  for  each  observation  in  the  volume, 
modified  in  such  a  way  that  none  of  the  flagged  observations  in  the 
volume  have  any  influence  on  the  estimate.  The  nm  are  elements  of 
the  column  of  matrix  M  that  is  associated  with  the  observation  /.  The 
derivation  of  the  interpolation  error  estimate  assumes  that  all  the  other 
estimated  errors  and  correlations  used  are  perfect.  Since  this  is  obviously 
not  the  case,  we  arbitrarily  add  0.1  to  the  interpolation  error  estimate 
to  account  for  these  other  sources  of  error. 

Once  we  have  estimated  the  interpolation  error  at  each  flagged 
observation  location  from  the  right-hand  side  of  equation  (51),  we  compute 
the  analyzed  increments,  the  /f ,  at  the  same  locations,  again  not  allowing 
any  of  the  potentially  bad  observations  to  influence  the  calculation.  The 
questionable  increments  are  subtracted  from  these  analyzed  increments, 
and  the  analysis/observation  deviations  (/“  -/? )  are  compared  to  the 
expected  interpolation  errors  at  the  same  locations.  The  differences 
between  the  computed  and  expected  values  are  used  to  sort  the  flagged 
observations  so  that  the  observations  with  the  largest  differences  from 
the  expected  errors  will  be  checked  first  by  the  analysis. 

Proceeding  in  this  sorted  order,  each  flagged  observation  is  individually 
checked  against  a  prescribed  tolerance,  given  by 


TOL  =  T\ 


(52) 


2  .... 
where  T  q  is  currently  10.  The  analysis/observation  deviation  js  compared 

to  this  tolerance,  and  if  (/f  -/? )2  >  TOL,  the  observation  is  rejected.  For 

this  purpose,  the  computations  of  the  interpolation  error  estimate  and  the 

analyzed  value  at  the  location  of  the  suspected  data  do  not  include 

the  influence  of  the  observation  being  tested,  but  do  include  the  other 

flagged  observations  in  the  volume.  However,  once  an  observation  has 

been  rejected,  it  is  not  allowed  to  influence  the  decisions  made  about 

any  other  observations,  thereby  explaining  why  the  observations  are 

sorted  so  those  most  likely  in  error  are  checked  first.  Once  every 

questionable  increment  has  been  subjected  to  this  final  corroboration 

test,  the  weights  of  the  remaining  good  observations  are  recomputed 

before  the  analyzed  increments  are  determined  for  each  grid  point  in 

the  volume.  Lorenc  (1981)  describes  how  to  perform  these  computations 

efficiently. 
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5.1  NOGAPS 


5.2  Preliminary  Analysis 


The  number  of  observations  rejected  during  this  stage  of  the  analysis 
cannot  exceed  the  percentage  of  observations  that  were  flagged  for 
further  checking  by  the  previous  quality-control  algorithms.  Of  course, 
not  all  of  the  flagged  observations  will  be  rejected,  since  some  of  them 
are  undoubtedly  supported  by  other  nearby  observations.  When 
corroborated  by  other  data,  the  observed  increment  should  not  differ 
from  the  analyzed  estimate  at  the  observation  location  by  more  than  the 
allowed  tolerance.  Since  the  volumes  overlap  one  another,  however,  it 
is  possible  for  an  observation  to  be  included  in  the  analysis  within  one 
volume,  only  to  be  rejected  later  when  it  checked  against  a  different 
set  of  data.  While  this  inconsistency  is  not  really  desirable,  the 
alternative — essentially  doing  the  analysis  twice — is  too  time-consuming 
to  be  practical,  given  current  computer  resources. 


The  software  developed  to  perform  the  global  analysis  described  in 
the  previous  section  was  designed  so  that,  with  minimal  modification, 
it  could  be  utilized  to  satisfy  many  atmospheric  analysis  requirements 
at  FNOC.  In  this  section  we  will  describe  these  different  applications 
and  the  software  modifications  that  are  made  to  the  analysis  for  each 
case. 


Daily  operations  at  FNOC  are  divided  into  two  watches,  each  beginning 
1  to  2  hours  after  the  synoptic  hour.  NOGAPS  3-day  and  5-day  forecasts 
are  produced  each  day  at  1200  UTC  and  0000  UTC,  respectively,  using 
all  of  the  data  available  at  3  hours  past  the  synoptic  hour  (+3  hours). 
We  refer  to  these  products  as  the  real-time  run,  for  they  are  the  products 
distributed  operationally.  As  previously  mentioned,  NOGAPS  is  run 
with  a  6-hour  data  assimilation  cycle,  which  allows  the  inclusion  of 
observations  that  are  not  available  for  the  initial  real-time  run.  Thus,  a 
reanalysis  (post-time  analysis)  for  0000  UTC  or  1200  UTC  is  performed 
using  all  data  available  at  +8  hours.  A  6-hour  model  forecast  is  made  from 
the  post-time  analysis  and  used  as  the  background  for  the  0600  UTC  or 
1800  UTC  (off-time)  analysis,  which  is  performed  using  all  data  available 
at  +10  hours.  Then,  a  6-hour  model  forecast  is  made  from  the  off-time 
analysis  to  be  used  as  the  background  for  the  next  synoptic  hour,  for 
both  the  real-time  and  the  post-time  runs. 


Prior  to  each  real-time  run  of  NOGAPS,  a  global  low-level  analysis 
is  run  using  all  data  available  at  +2  hours.  During  this  run,  the  MVOI 
analysis  described  in  Section  4  is  performed  at  the  four  pressure  levels 
from  1000  mb  to  700  mb,  inclusive.  All  data  types  are  used  for  this 
analysis  except  for  aircraft  reports,  satellite-derived  temperature  soundings, 
and  SSM/I  surface  wind  speeds.  This  analysis  is  used  by  FNOC  personnel 
to  determine  areas  where  synthetic  surface  observations  should  be 
generated  and  entered  into  the  NOGAPS  data  base  prior  to  the  real-time 
run.  It  is  also  used  to  provide  wind  directions  for  the  SSM/I  wind  speed 
observations  prior  to  their  ingest  into  the  real-time  NOGAPS  analysis. 


MVOI  Analysis  of  Meteorological  Data  at  FNOC 


53 


5.3  Limited-Area  Analyses 


5.4  HORAPS 


5.5  Stratospheric  Analysis 


The  NOGAPS  analysis  has  also  been  modified  so  that,  if  desired,  the 
analysis  can  be  performed  only  for  volumes  covering  a  specified  area, 
but  still  using  the  background  field  from  the  global  forecast  model. 
This  limited-area  analysis  requires  only  a  small  fraction  of  the  time  that 
it  takes  to  run  the  full  global  analysis,  and  it  is  used  as  a  tool  by  the 
FNOC  personnel  responsible  for  the  generation  of  synthetic  surface 
observations  in  the  vicinity  of  extratropical  cyclones.  Prior  to  admitting 
these  observations  into  the  operational  database,  the  limited-area  analysis 
is  run  to  ensure  that  the  observations  they  have  created  will  have  the 
desired  effect  upon  the  NOGAPS  analysis.  Since  the  analysis  at  any 
grid  point  is  independent  of  the  results  at  any  other  grid  point,  given 
the  same  database,  the  limited-area  analysis  can  exactly  duplicate  the 
global  MVOI  in  a  selected  region. 


To  provide  higher  resolution,  more  localized  information,  the  Navy 
Operational  Regional  Atmospheric  Prediction  System  (NORAPS)  is  run 
for  four  different  areas  each  watch.  The  regional  analyses  differ  from 
the  limited-area  analyses,  since  NORAPS  is  a  complete  data  assimilation 
system  that  cycles  with  its  own  analysis  and  forecast  model.  The  NORAPS 
forecast  model  is  a  relocatable  grid-point  model  (Hodur,  1987),  which 
is  run  with  100-krn  resolution  for  the  Western  Atlantic  and  Western 
Pacific  areas,  with  80-km  resolution  for  the  Mediterranean  area,  and 
with  40-km  resolution  for  the  Persian  Gulf  area.  For  each  area,  the 
MVOI  analysis  fields  are  produced  on  the  forecast  grid  of  the  model  at 
the  same  16  pressure  levels  used  by  the  global  analysis.  The  analysis 
background  fields  are  12-hour  forecasts  valid  at  the  analysis  time  made 
by  the  NORAPS  forecast  model.  The  observational  data  base  and  data 
selection  strategy  i:;  the  same  as  that  used  by  the  global  analysis.  Since 
NORAPS  has  been  designed  so  that  it  can  be  located  anywhere  in  the 
world,  the  variable-size  volumes,  such  as  those  used  in  NOGAPS,  are 
not  practical.  Thus,  the  areal  extent  of  each  NORAPS  analysis  volume 
is  an  approximately  1200-km  square  that  is  permitted  to  expand  if  it 
does  not  contain  a  minimum  of  60  observations.  The  maximum  number 
of  observations  allowed  in  a  NORAPS  volume  is  300. 


To  further  satisfy  Navy  requirements,  FNOC  produces  a  stratospheric 
analysis  each  day  for  0000  UTC  and  1200  UTC.  The  differences  between 
this  analysis  and  the  NOGAPS  analysis  are  outlined  here.  The  stratospheric 
analysis  is  performed  on  a  2.5°  x  2.5°  spherical  grid  for  twelve  pressure 
levels  from  200  mb  to  0.4  mb.  The  analysis  levels  are  the  standard 
pressure  levels  from  200  mb  to  10  mb,  inclusive,  along  with  5  mb, 
2  mb,  1  mb,  and  0.4  mb.  The  background  fields  utilized  by  the  stratospheric 
analysis  are  the  same  as  those  used  by  the  NOGAPS  analysis  for  levels 
up  to  10  mb.  For  the  levels  above  10  mb,  the  background  fields  are  the 
fields  from  the  previous  analysis  (12-hour  prior)  adjusted  by  the  difference 
between  the  10-mb  background  fields  and  the  10-mb  fields  from  the 
previous  analysis.  That  is,  we  assume  that  the  magnitudes  of  the  analyzed 
4>,  u,  and  ('corrections  are  constant  with  height  above  10  mb  at  a  given 
location,  and  adjust  the  background  fields  accordingly  before  performing 
the  next  analysis.  The  prediction  error  estimates  for  these  upper  levels 
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are  increased  considerably  to  reflect  the  fact  that  these  fields  are  not  as 
accurate  as  the  model-controlled  background  fields.  Thus,  the  resulting 
analyses  at  the  upper  levels  are  more  greatly  influenced  by  the 
observational  data  than  by  the  background  fields. 

♦  The  only  types  of  observations  used  in  the  stratospheric  analysis  are 

radiosondes  and  satellite-derived  temperature  soundings.  Virtually  the 
only  data  available  at  levels  above  10  mb  are  the  thickness  observations 
derived  from  the  two  NOAA  polar-orbiting  satellites.  To  improve  the 
data  coverage  for  the  analysis,  satellite  observations  are  utilized  for  a 

£  12-hour  time  window  beginning  9  hours  before  the  analysis  time.  The 

radiosondes  used  are  those  valid  at  the  analysis  time.  Since  the  global 
distribution  of  data  above  200  mb  is  relatively  uniform,  there  is  no  need 
for  the  data-dependent  analysis  volume  scheme  described  in  the  previous 
section  for  the  NOGAPS  analysis.  Thus,  for  the  stratospheric  analysis, 
the  areal  extent  of  each  volume,  excluding  the  polar  caps,  is  approximately 

•  a  2000-km  square. 


5.6  Shipboard  Analyses  The  MVOl  analysis  has  also  been  adapted  for  shipboard  use  on  the 

Tactical  Environmental  Support  System  (TESS(3)).  This  new  hardware 
^  will  allow  ships  at  sea  to  receive  and  process  observations,  satellite 

imagery,  and  FNOC  environmental  products.  The  sensor  package 
accompanying  TESS(3)  gives  ships  the  capability  to  automatically  sample 
and  process  their  own  observations.  With  resident  applications  programs, 
such  as  the  MVOI,  information  from  these  various  sources  can  be 
combined  to  produce  updated,  localized  analyses  in  support  of  shipboard 
#  and  tactical  operations. 


6.0  Summary  Since  its  operational  implementation  at  FNOC  in  January  1988,  the 

MVOI  analysis  has  become  the  workhorse  in  satisfying  the  Navy’s 
®  many  and  diverse  atmospheric  analysis  requirements.  With  only  relatively 

minor  modifications,  the  same  software  is  used  to  perform  the  NOGAPS 
global  analysis,  the  different  NORAPS  regional  analyses,  the  global 
upper  stratospheric  analysis,  the  global  low-level  analysis,  and 
limited-area  and  shipboard  analyses  wherever  they  are  needed.  Not  only 
0  does  this  simplify  software  maintenance,  but  it  also  allows  all  of  the 

Navy’s  operational  analyses  to  simultaneously  benefit  from  improvements 
realized  through  the  latest  research  and  development  efforts. 

The  MVOI  analysis  has  proven  to  be  an  especially  effective  and  flexible 
vehicle  for  the  assimilation  of  new  types  of  observational  and  synthetically 
generated  data  into  the  Navy’s  atmospheric  prediction  systems.  The 

*  incorporation  of  a  new  type  of  observation  into  the  analysis  is  relatively 

straightforward,  once  the  error  properties  and  the  spatial  distribution  of 
the  data  have  been  determined.  This  flexibility  has  been  demonstrated 
by  the  operational  implementation  of  the  DMSP  SSM/I  wind  speed  data 
and  several  types  of  synthetic  observations.  For  example,  the  synoptic 

•  skills  of  FNOC  personnel  are  utilized  to  generate  synthetic  surface 

observations  in  the  vicinity  of  oceanic  extratropical  cyclones  that  are 
poorly  depicted  by  NOGAPS.  Before  admitting  these  observations  into 
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